To Ben and Kate ‘the cool.’



Strong stuff, isn’t it? He paused. Limits, limits everywhere. 

William Boyd, The New Confessions 



Introduction 


This book is written for anyone who has an interest in mathematics. It occupies
territory that lies midway between a popular science book and a traditional 
textbook. Its subject matter is the part of mathematics that is called analysis. This 
is a very rich branch of mathematics that is also relatively young in historical 
terms. It was first developed in the nineteenth century but it is still very much 
alive as an area of contemporary research. Analysis is typically first introduced in 
undergraduate mathematics degree courses as providing a ‘rigorous’ (i.e. logically 
flawless) foundation to a historically older branch of mathematics called the 
calculus. Calculus is the mathematics of motion and change. It evolved from the 
work of Isaac Newton and Gottfried Leibniz in the seventeenth century to become 
one of the most important tools in applied mathematics. Now the relationship 
between analysis and calculus is extremely important but it is not the subject 
matter of this book. Indeed readers do not need to know any calculus at all to read 
most of it. 

So what is this book about? In a sense it is about two concepts - number and 
limit. Analysis provides the tools for understanding what numbers really are. It 
helps us make sense of the infinitesimally small and the infinitely large as well as 
the boundless realms between them. It achieves this by means of a key concept - 
the limit - which is one of the most subtle and exquisitely beautiful ideas ever 
conceived by humanity. This book is designed to gently guide the reader through 
the lore of this concept so that it becomes a friend. 

So who is this book for? I envisage readers as falling into one of three (not 
necessarily disjoint) categories: 

• The curious. You may have read a popular book on mathematics by a 
masterful expositor such as Marcus du Sautoy or Ian Stewart. These books
stimulated you and started you thinking. You’d like to go further but don’t 
have the time or background to take a formal course - and self-study from a 
standard textbook is a little forbidding. 

• The confused. You are at university and taking a beginner’s course in analysis.
You are finding it hard and are seriously lost. Maybe this book can help you 
find your way? 

• The eager. You are still at school. Mathematics is one of your favourite 
subjects and you love reading about it and discovering new things. You’ve 
picked up this book in the hope of finding out more about what goes on at 
college/university level. 
To read this book requires some mathematical background but not an awful
lot. You should be able to add, subtract, multiply and divide whole numbers and
fractions. You should also be able to work with school algebra at the symbolic
level. So you need to be able to calculate with particular fractions and also be able
to deal with the general case $\frac{a}{b} - \frac{c}{d} = \frac{ad - bc}{bd}$. I’ll take it for granted that you can
multiply brackets to get $(x + y)(a + b) = xa + ya + xb + yb$ and also recognise
identities such as the ‘difference of two squares’ $x^2 - y^2 = (x + y)(x - y)$. Beyond
this it is vital that you are willing to allow your mind to engage with extensive 
bouts of systematic logical thought. 

As I pointed out above, you don’t need to know anything about calculus to 
read most of this book (but anything you do know can only help). To keep things 
as simple as possible, I avoid the use of set theory (at least until the end of the 
book) and the technique of ‘proof by mathematical induction,’ but both topics 
are at least briefly introduced in appendices for those who would like to become 
acquainted with them. 

Sometimes I am asked what mathematicians really do. Of course there are
as many different answers as there are different traditions within the vast
scope of modern mathematics. But an essential feature of what is sometimes called 
‘pure’ mathematics is the process of ‘proving theorems’. A theorem is a fancy way 
of talking about a chunk of mathematical knowledge that can be expressed in 
two or three sentences and that tells you something new. A proof is the logical 
argument we use to convince ourselves (and colleagues, students and readers) 
that this new knowledge is really correct. If you pick up a mathematics book in 
a library it may well be that 70 to 80% of it just consists of lists of theorems and 
proofs - one after the other. On the other hand most expository books about 
mathematics that are written for a general reader will contain none of these at all. 
In this book you’ll find a halfway house. The author’s goal is to give the reader a 
genuine insight into how mathematicians really think and work. So you’re going 
to meet theorems and proofs - but the development of these is going to be very 
gentle and easy paced. Each time there will be discussion before and after and - 
at least in the early part of the book, when the procedures are unfamiliar - the 
proofs will be spelt out in much greater detail than would be the case in a typical 
textbook. 

So what is the book about anyway? In one sense it is the story of a quest -
the long quest of the human race to understand the notion of number. In a 
sense there are two types of number. There are those like the whole numbers 
1, 2, 3, 4, 5, 6 etc. that come in discrete chunks and there are those that we call 
‘real numbers’ that form a continuum where each successive number merges 
into the last and there are no gaps between them. 1 It is this second type of 


1 This is an imprecise suggestive statement. If you want to perceive the truth that lies behind 
it then you must read the whole book. 
number that is the true domain of analysis. 2 These numbers may appear to be
very familiar to us and we may think that we ‘understand’ them. For example
you all know the number that we signify by π. It originates in geometry as the
universal constant you obtain when the circumference of any (idealised) circle
is divided by its diameter. You may think you know this number because you
can find it on a pocket calculator (mine gives it as 3.1415927) but I hope to be
able to convince you that your calculator is lying to you. You really don’t know
π at all - and neither do I. This is because the calculator only tells us part of
the decimal expansion of π (with enough accuracy to be fine for most practical
applications) but the ‘true’ decimal expansion of π is infinite. We are only human
beings with limited powers and our brains are not adapted to grasp the infinite 
as a whole. But mathematicians have developed a tool which enables us to gain 
profound insights into the infinite nature of numbers by only ever using finite 
means. This tool is called the limit and this book will help you understand how it 
works. 


Guide for Readers 

There are thirteen chapters in this book which is itself divided into two parts. 
Part I comprises Chapters 1 to 6 and Part II is the rest. The six chapters in Part I 
can serve as a background text for a standard first year undergraduate course in 
numbers, sequences and series (or in some colleges and universities, the first half
of a first or second year course on introductory real analysis). Chapters 1 and 2 
introduce the different types of number that feature in most of this book: natural, 
prime, integer, rational and real. Chapter 3 is the bridge between number and 
analysis. It is devoted to the art of manipulating inequalities. Chapters 4 and 5 
focus on limits of sequences and begin the study of analysis proper. Chapter 6 
(which is the longest in the book) deals with infinite series. 

Part II comprises a selection of additional interesting topics. In Chapter 7 we 
meet three of the most fascinating numbers in mathematics: e, π and γ. Chapters 8
and 9 introduce two topics that normally don’t feature in standard undergraduate 
courses - infinite products and continued fractions (respectively). Chapter 10 
begins the study of the remarkable theory of infinite numbers. Chapter 11 is 
perhaps, from a conceptual perspective, the most difficult chapter in the book as 
it deals with the rigorous construction of the real numbers using Dedekind cuts 
and the vital concept of completeness. Chapter 12 is a rapid survey of the further 
development of analysis into the realm of functions, continuity and the calculus. 

2 To be precise real analysis, which is that part of analysis which deals with real sequences, 
series and functions. This topic should be distinguished from complex analysis which studies 
complex sequences, series and functions and which isn't the subject of this book, although we 
do briefly touch on it in Chapter 8. 


In Chapter 13 we give a brief account of the history of the subject and in Further
Reading we review some of the literature that the reader might turn to next after 
reading this book. 

You learn mathematics by doing and not by reading and so each chapter 
in Part I closes with a set of exercises which you are strongly encouraged to 
attempt. 3 As well as practising techniques, these also enable you to further 
develop some aspects of the theory that are omitted from the main text (where 
explicit guidance is generally provided). So for example, in the exercises for 
Chapters 4 and 5 you meet the useful concept of a subsequence and can 
prove the Bolzano-Weierstrass theorem for yourself. Hints and solutions to 
selected exercises can be found at the end of the book. Professional educators 
can obtain full solutions by following instructions that can be found on
http://ukcatalogue.oup.com/product/9780199640089.do

I would expect that most readers will have ready access to the internet and so I 
have included a lot of references to Wikipedia throughout the text. This is so that 
you can very quickly find out a lot more about a topic if you want to. However 
bear in mind that Wikipedia is not yet thoroughly reliable and you should never 
quote mathematical results found there unless you have also confirmed them by 
consulting an authoritative text. 


Note for Professional Educators 

As pointed out above, the book falls naturally into two parts. Part I shadows 
a first year university mathematics course on sequences and series but with a 
little bit of number theory thrown into the mix. No calculus is used in Part I 
except at the very end in an optional aside. There is also no set theory until the 
very end of the book. Part II is a collection of subsidiary topics. I feel freer to 
use calculus here - but only occasionally. Indeed elementary analytic number 
theory takes over some of the usual role of calculus in this book as motivation 
for learning analysis. I avoid the axiomatic method throughout this book. This 
is a pedagogic device rather than a philosophical standpoint. I believe it is more 
helpful for those encountering the properties of real numbers for the first time 
to first develop basic analytical insight into their manipulation. The niceties 
of complete ordered fields can then be left to a later stage of their education. 
Indeed as Ivor Grattan-Guinness writes in ‘The Rainbow of Mathematics’ (p.740), 
‘The teaching of axioms should come after conveying the theory in a looser 
version’. 


3 These are drawn from a variety of sources including some of the textbooks listed in 
Further Reading. 
Part I


Approaching Limits 




A Whole Lot of Numbers


"Think about maths Felix," Levin advised seriously as I walked out,
"it’ll take your mind off your nervous breakdown."

White Light, Rudy Rucker 


1.1 Natural Numbers 


The numbers that we learn to count with are called natural numbers by
mathematicians. If we try to make a list of them that starts with the smallest 
we begin 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, . . . and then at some stage we get 
bored and so we stop, writing the three dots ... to indicate ‘etcetera’ or ‘so it goes’. 
I stopped at 13 but there is no good reason to do this. We can go on to 100 or 127 
or 1000 or 1000000 or any enormously large number we care to choose. There are 
some impressively big numbers out there. At the time of writing the population of 
the world is estimated to be 6762875008, 1 and some predictions expect it to reach 
around 9 billion by 2040. 2 

This pales into insignificance when compared to some of the numbers we 
can write down such as $10^{100}$ which is one followed by a hundred noughts. This
number is sometimes called a googol. An even larger number is $10^{10^{100}}$ which is
one followed by a googol of noughts and this is called a googolplex. It’s easy to 
create large numbers in this way but this is a process that has no end. There is no 
largest natural number and this leads us to use phrases like ‘the numbers go on to 
infinity’. What do we mean by ‘infinity’ when we make such a statement? Are we 
indicating some mysterious concept that lies beyond our usual understanding of 


1 See http://www.census.gov/ipc/www/idb/worldpopinfo.html 

2 See http://en.wikipedia.org/wiki/World_population 
numbers? To probe further it would be useful to give a precise argument which
makes it logically clear that there cannot be a largest natural number. This uses a 
technique that mathematicians call proof by contradiction which we will employ 
time and time again in this book, so it’s a good idea to get used to it as soon as 
possible. We begin by making an assertion that we intend to disprove. In this 
case it is ‘there is a largest number’. Let’s give this largest number a symbol N. 
Now suppose that N is a legitimate natural number. Then we can add 1 to it to 
get another number N + 1. Now there are three possibilities for relating N to 
N + 1. Either N + 1 is larger than, equal to or smaller than N. Now if N + 1 is larger
than N then we’ve created our desired contradiction as we have a number that 
is larger than the largest and this cannot be. If N + 1 = N then we can subtract 
N from both sides to get 1=0 which is also a contradiction. 3 I’ll leave you to 
work out for yourself the contradiction that follows from supposing that N + 1 is 
smaller than N. In all cases we see that if a number such as N exists then we have 
a contradiction. As contradictions are not allowed in mathematics, we conclude 
that a largest natural number cannot exist. When we say the ‘numbers go on to 
infinity’ we are really doing nothing more than describing the fact that counting 
can never end. From this point of view ‘infinity’ is not some mystical unreachable 
end-point but a linguistic label we use to indicate a never-ending process. 


1.2 Prime Numbers 


We can add natural numbers together to make new numbers to our heart’s
content. We can also subtract b from a to make the natural number a − b but this
only works if b is smaller than a. Addition and subtraction are mutually inverse
in that each undoes the effect of the other. This vague wording is made
more precise with symbols: (a + b) − b = (a − b) + b = a where the operation
in brackets is always carried out first. Repeated addition of the same number to
itself is simplified by introducing multiplication so a + a = 2 × a which it is
more convenient to write as 2a, and more generally if we add a to itself n times
then we get na. Just as subtraction and addition are mutually inverse, so
are multiplication and division. Indeed we say that b ÷ a = n or equivalently
$\frac{b}{a} = n$ provided that b = na. It’s important to be aware that at this stage we are
only dealing with natural numbers and so b ÷ a is only defined for our present
purposes if there is no remainder when division is carried out - so 6 ÷ 3 = 2 but
5 ÷ 3 is not allowed. Now suppose that we have a number b that can be written
b = na so that b ÷ a = n and b ÷ n = a. We say that a and n are factors (also

3 0 is not a natural number but this doesn't invalidate the argument. 
called divisors) of b, e.g. 2 and 3 are factors of 6 and 10 and 7 are factors of 70.
More generally we can see that 6 is ‘built’ from its factors by multiplying them 
together. Now we’ll ask a very important question. 
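For readers who like to experiment on a computer, here is a minimal Python sketch of the idea of a factor. It is only an illustration: the function names `divides_exactly` and `factors` are hypothetical and not part of the text, and the search simply tries every candidate in turn.

```python
def divides_exactly(b, a):
    """Return True when b divided by a leaves no remainder, i.e. b = n * a."""
    return b % a == 0


def factors(b):
    """List every natural number a that is a factor (divisor) of b."""
    return [a for a in range(1, b + 1) if divides_exactly(b, a)]


print(factors(6))              # [1, 2, 3, 6]
print(factors(70))             # [1, 2, 5, 7, 10, 14, 35, 70]
print(divides_exactly(5, 3))   # False: 5 divided by 3 leaves a remainder
```

Trying each candidate from 1 to b is slow for large numbers, but it mirrors the definition directly.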

Can we find a special collection of natural numbers that has the property 
that every other number can be built from them by multiplying a finite 
number of these together (with repetitions if necessary)? 

We will see that the answer to this question is affirmative and that the numbers 
that we need are the prime numbers or primes. A formal definition of a prime 
number is that it is a natural number that is greater than 1 whose only factors are 
1 and itself. So if p is prime we have p = 1 × p = p × 1 but there are no other
numbers a and n such that p = na. The list of prime numbers begins

2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 

83, 89, 97, 101, ... 

Note that 2 is the only even prime number (why?). Prime numbers are, on the
one hand, rather simple things but, on the other hand, they are one of the most
mysterious and intractable mathematical objects that we have ever discovered. 
First of all there is no clear discernible pattern in the succession of prime numbers - 
there is no magic formula which we can use to find the nth prime. At the start of 
the list of natural numbers, prime numbers seem quite frequent but as we progress 
to higher and higher numbers then they appear to be rarer and rarer. Later on 
in this section we will demonstrate two interesting facts. Firstly that there are an 
infinite number of prime numbers (and so there is no largest prime) and secondly 
given any natural number m we can (if we start at a large enough number) find 
m + 1 consecutive natural numbers, none of which is prime. 
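The definition of a prime can be turned into a short test. The Python sketch below is a hedged illustration rather than anything the text prescribes: `is_prime` is a hypothetical name, and the shortcut of only testing candidates a with a × a ≤ n is an added assumption (it works because in any factorisation n = ac one of the two factors is no bigger than the square root of n).

```python
def is_prime(n):
    """True when n is greater than 1 and its only factors are 1 and itself."""
    if n < 2:
        return False
    a = 2
    while a * a <= n:      # a factor pair always has one member this small
        if n % a == 0:
            return False   # found a factor other than 1 and n
        a += 1
    return True


primes = [n for n in range(1, 102) if is_prime(n)]
print(primes)
```

Running it reproduces the list of primes given above, from 2 up to 101.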

Before we explore these ideas, we’ll return to the question that motivated us to 
look at prime numbers in the first place. 

A natural number that is bigger than 1 and isn’t prime is said to be composite. So 
if b is composite we can always find two factors a and c such that b = ac. Consider
the number 720. It is clearly composite as 720 = 2 × 360 or 10 × 72. Let’s look more
closely at the second of these products. As 10 = 2 × 5 and 72 = 6 × 12 = (2 × 3) ×
(3 × 2 × 2), after rearranging we can write $720 = 2 \times 2 \times 2 \times 2 \times 3 \times 3 \times 5 = 2^4 \cdot 3^2 \cdot 5$.
Now 2, 3 and 5 are all prime and the decomposition we’ve found is called the 
prime factorisation of 720. We’ve built it by multiplying together (as many times 
as were needed) all the prime numbers that are factors of our number. We call 
these numbers prime factors as they are both prime and also are factors of the 
number in question. Now suppose that we are given a general natural number n 
whose prime factors are $p_1, p_2, \ldots, p_N$. We will write $n = p_1^{m_1} p_2^{m_2} \cdots p_N^{m_N}$. This
tells us that to get n we must multiply $p_1$ by itself $m_1$ times and then multiply
this number by $p_2$, $m_2$ times and keep going until we’ve multiplied $m_N$ times
by the number $p_N$. Notice the use of the notation ⋯ which we use for ‘keep
multiplying’ (in contrast to ... which we met before and which means keep going
along some list). So in the example we’ve just seen where n = 720, we have
$N = 3$, $p_1 = 2$, $p_2 = 3$, $p_3 = 5$, $m_1 = 4$, $m_2 = 2$ and $m_3 = 1$.
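The worked example for 720 can be checked mechanically. The following Python sketch is one possible approach, assuming simple trial division (a method chosen here for illustration, not one named in the text). The hypothetical function `prime_factorisation` returns each prime factor together with the number of times it occurs.

```python
def prime_factorisation(n):
    """Return {prime: exponent} for a natural number n > 1 by trial division."""
    exponents = {}
    p = 2
    while p * p <= n:
        while n % p == 0:              # divide out p as often as it occurs
            exponents[p] = exponents.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                          # whatever is left over is itself prime
        exponents[n] = exponents.get(n, 0) + 1
    return exponents


print(prime_factorisation(720))        # {2: 4, 3: 2, 5: 1}
```

The dictionary keys play the role of the primes and the values play the role of the exponents in the notation above.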

We’ll now show that every natural number has a prime factorisation. We’ll 
do this by using the ‘proof by contradiction’ technique that we employed in the 
first section to show that there is no largest natural number. This time we’ll 
set our argument out in the way that a professional mathematician does it. In 
mathematics, new facts about numbers (or other mathematical objects) that are 
established through logical reasoning are called theorems and the arguments that 
we use to demonstrate these are called proofs. Theorems are given numbers that 
serve as labels to help us refer back to them. So the theorem that we are about to 
prove, which is that every natural number has a prime factorisation, will be called 
Theorem 1.2.1 where the first 1 refers to the chapter we are in, the 2 to the fact 
that we are in section two of that chapter and the second 1 tells us that this is the 
first theorem of that section. This result is so important that it is sometimes called 
the fundamental theorem of arithmetic. 

Theorem 1.2.1. Every natural number has a prime factorisation. 

Proof. Let n be the smallest natural number that doesn’t have a prime 
factorisation. Clearly n cannot be prime and so it is composite. Write n = bc,
then either b and c are both prime, or one of b or c is prime and the other is
composite or both b and c are composite. We deal with each possibility in turn.
Firstly suppose that b and c are both prime. Then n = bc has a prime factorisation
and we have our contradiction. If b is prime and c is composite we know that c is a 
smaller number than n and so it must have a prime factorisation. Now multiplying 
by the prime number b produces a prime factorisation for n and we again have a 
contradiction. The case where b is composite and c is prime works by the same 
argument. If b and c are both composite then each has a prime factorisation and 
multiplying them together gives us the prime factorisation for n that we need to 
establish the contradiction in the third and final case. □ 

The symbol □ is a convenient notation that mathematicians have developed to 
signal that the proof has ended. 

We can make Theorem 1.2.1 into a sharper result by proving that not only 
does every natural number have a prime factorisation, but that this factorisation 
is unique, i.e. a number cannot have two different factorisations into primes. We 
will not prove this here (we are not going to prove everything in this book) but we 
will feel free to use this as a fact in future. You are invited to try to come up with a 
convincing argument yourself as to why this is true. Start with a natural number 
n and assume that it is the smallest one that has a prime factorisation using two 
different sets of prime numbers, and then see if you can get a contradiction. (Hint: 
You will need the fact, which we also haven’t proved here, that if a prime number 
p is a factor of a composite number n then it is also a factor of at least one of the
divisors of n.) 4 

We’ve seen that prime numbers are defined to have no factors other than 
one and themselves. Later on in this book we will need to have some knowledge 
of square-free numbers and this is an ideal place to introduce them. These are 
precisely those natural numbers that have no factors that are squares (other than 
one). So 12 is not square-free as $12 = 3 \times 4 = 3 \times 2^2$ so $2^2$ is a factor. All prime
numbers are clearly square-free and so are numbers like 6 = 2 × 3. We start the
list of square-free numbers as follows: 

1, 2, 3, 5, 6, 7, 10, 11, 13, 14, 15, 17, 19, 21, 22, 23, 26, ... 

Here are two useful facts about square-free numbers: 

1. The square-free numbers are precisely those for which every prime number 
that occurs in the prime factorisation only appears once. 

So in the prime factorisation of a square-free number $n = p_1^{m_1} p_2^{m_2} \cdots p_N^{m_N}$
we must have $m_1 = m_2 = \cdots = m_N = 1$. For if any of these numbers is
greater than 1 then we can pull out a factor that is a square.

2. Every natural number can be written in the form 

$n = j^2 k$,

where j and k are natural numbers and k is square-free. 

To see that this is true, first take j = 1, then n = k and so we get all the
square-free numbers. Now to get the rest of the natural numbers we need to
be able to include all the squares and their multiples. To get the squares, just
take k = 1 and consider all possible values of j. To get all numbers that are
of the form $2j^2$ just take k = 2 and let j vary freely, and you should be able
to work the rest out for yourself.

For a numerical example, consider n = 3120. Then j = 4 and k = 195
since $3120 = 4^2 \times 195$ and 195 = 3 × 5 × 13.
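Both facts about square-free numbers can be explored with a short Python sketch. This is an illustration under stated assumptions: the names `square_free_decomposition` and `is_square_free` are hypothetical, and the method (repeatedly pulling out square factors p × p) is just one straightforward way to find j and k.

```python
def square_free_decomposition(n):
    """Write n = j * j * k with k square-free and return the pair (j, k)."""
    j, k = 1, n
    p = 2
    while p * p <= k:
        while k % (p * p) == 0:    # pull out the square factor p * p
            k //= p * p
            j *= p
        p += 1
    return j, k


def is_square_free(n):
    """A number is square-free exactly when nothing can be pulled out (j = 1)."""
    return square_free_decomposition(n)[0] == 1


print(square_free_decomposition(3120))                  # (4, 195)
print([n for n in range(1, 27) if is_square_free(n)])
```

The second line of output reproduces the list of square-free numbers given above, up to 26.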

Our next task is to give Euclid’s famous proof that there is no largest prime 
number. Before we do that, we need some preliminaries. Let n be a natural number. 
We use the notation n! to stand for the product of all the numbers that start
with 1 and finish with n, so n! = 1 × 2 × 3 × ⋯ × (n − 1) × n. For example
1! = 1, 2! = 1 × 2 = 2, 3! = 1 × 2 × 3 = 6, 4! = 1 × 2 × 3 × 4 = 24. I hope
you spotted the useful formula n! = n(n − 1)!, so that e.g. 5! = 5 × 4! = 120. n!
is pronounced ‘n factorial’ and it plays a useful role in applied mathematics and 


4 The standard proof can be found online at http://en.wikipedia.org/wiki/Fundamental_theorem_of_arithmetic
in probability theory as it is precisely the number of different ways in which n
objects can be arranged in different orders. 
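The formula n! = n(n − 1)! translates almost word for word into a recursive function. The Python sketch below is a minimal illustration (the function name and the choice of four letters as sample objects are assumptions made here). It also counts orderings directly using `itertools.permutations` from the standard library.

```python
from itertools import permutations


def factorial(n):
    """n! for a natural number n, using the recurrence n! = n * (n - 1)!."""
    if n < 1:
        raise ValueError("n must be a natural number")
    return 1 if n == 1 else n * factorial(n - 1)


print([factorial(n) for n in range(1, 6)])   # [1, 2, 6, 24, 120]

# n! also counts the different orders in which n objects can be arranged:
print(len(list(permutations("abcd"))))       # 24
```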

The next thing we need are a couple of useful facts: 

Fact 1. If a number b is divisible by a then it is divisible by every prime 
factor of a. 

To see this, use Theorem 1.2.1 to write $a = p_1^{m_1} p_2^{m_2} \cdots p_N^{m_N}$. Now since b is
divisible by a we can write b = ac and so $b = p_1^{m_1} p_2^{m_2} \cdots p_N^{m_N} c$ and that’s enough
to give you the result we need.

We should also be aware of the contrapositive of Fact 1 (loosely, its ‘logical negation’), which is that if b
fails to be divisible by even one prime factor of a then it is not divisible by a. This fact does
not need a proof of its own as it follows from Fact 1 by pure logic.

Fact 2. If $b_1$ is larger than $b_2$ and a is a factor of both numbers then it is also a
factor of $b_1 - b_2$.

Since a is a factor of both numbers we can write $b_1 = ac$ and $b_2 = ad$ and so
$b_1 - b_2 = ac - ad = a(c - d)$, which does the trick. By the same argument you
can show that a is a factor of $b_1 + b_2$ (whether or not $b_1$ is larger).

We’re now ready to give the promised proof that there are infinitely many 
primes. This again appears in standard ‘theorem-proof’ form and we apply the
same proof by contradiction technique that we used before to show that there are 
infinitely many natural numbers. 

Theorem 1.2.2. There are infinitely many prime numbers. 

Proof. Suppose that the statement in the theorem is false and let p denote the 
largest prime number. Consider the number P = p! + 1. Our goal is to prove
that there is a prime number q that is larger than p and this will then give us
our contradiction. We’ll show first that P is not divisible by any prime number
on the list 2, 3, ..., p. Let’s start with 2. Suppose that P really is divisible by 2.
Since p! = 1 × 2 × ⋯ × p is divisible by 2, so is P − p! by Fact 2 above. But
P − p! = 1 which is not divisible by 2. So we have a contradiction and conclude
that P is not divisible by 2. Now repeat the argument we’ve just given to see that 
P is not divisible by 3, 5, 7 or any prime number up to and including p. Now if 
P is divisible by any composite number a, then by Fact 1 it must be divisible by 
every prime factor of a. We’ve already seen that P is not divisible by any prime 
number on our list and so we conclude that either P itself is prime (in which case 
q = P) or it is divisible by a larger prime than p (in which case q is larger than p 
but smaller than P). This gives the contradiction we were looking for and so we 
can conclude that there is no largest prime number. □ 
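It can be instructive to watch the construction in this proof at work on small primes. The Python sketch below is only an illustration: the helper name `smallest_prime_factor` is hypothetical, and the primes 2, 3, 5, 7 and 11 are chosen arbitrarily. For each p it forms P = p! + 1 and finds the smallest prime q dividing P, which the argument above says must be larger than p.

```python
from math import factorial


def smallest_prime_factor(n):
    """Smallest factor of n that is greater than 1; it is necessarily prime."""
    a = 2
    while a * a <= n:
        if n % a == 0:
            return a
        a += 1
    return n   # no smaller factor exists, so n itself is prime


for p in (2, 3, 5, 7, 11):
    P = factorial(p) + 1
    q = smallest_prime_factor(P)
    print(p, P, q, q > p)
```

Notice that P need not be prime itself: for p = 5 we get P = 121 = 11 × 11, and for p = 7 we get P = 5041 = 71 × 71. In both cases the prime factor found is larger than p, as the proof requires.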

Although there are an infinite number of prime numbers, these become rarer 
and rarer as we reach larger and larger numbers. The next result we’ll prove tells 
us that somewhere within the vast multitude of natural numbers, we can find
m + 1 numbers in succession, all of which are composite, for any m we care
to name.

Theorem 1.2.3. For any natural number m we can find m + 1 successive natural 
numbers, none of which is prime. 

Proof. This is a ‘proof by demonstration’ where we simply show you how to 
construct what we need. Indeed I claim that the m + 1 successive composite numbers
are given by the following list: (m + 2)! + 2, (m + 2)! + 3, ..., (m + 2)! + (m +
1), (m + 2)! + (m + 2). I hope you agree with me that there really are m + 1
numbers on the list. Since (m + 2)! = 1 × 2 × 3 × ⋯ × m × (m + 1) × (m + 2),
it is divisible by each of 2, 3, ..., m + 2. It follows that (m + 2)! + 2 is divisible by
2 (and is larger than 2) and so cannot be prime, (m + 2)! + 3 is divisible by 3 and cannot be prime, ...,
(m + 2)! + (m + 2) is divisible by m + 2 and so cannot be prime and that concludes
what we aimed to show. □ 
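The list in this proof is easy to generate and check by machine. The Python sketch below is a hedged illustration (the names `composite_run` and `is_composite` are hypothetical). It builds the numbers (m + 2)! + 2, ..., (m + 2)! + (m + 2) for a few small values of m and confirms that each one has a factor other than 1 and itself.

```python
from math import factorial, isqrt


def composite_run(m):
    """Return the m + 1 consecutive numbers (m+2)! + 2, ..., (m+2)! + (m+2)."""
    base = factorial(m + 2)
    return [base + k for k in range(2, m + 3)]


def is_composite(n):
    """True when n has a factor strictly between 1 and n."""
    return any(n % a == 0 for a in range(2, isqrt(n) + 1))


for m in (1, 2, 3, 4):
    run = composite_run(m)
    print(m, run, all(is_composite(n) for n in run))
```

For m = 2 the list is 26, 27, 28, which are three successive numbers none of which is prime.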

Mathematicians like to play (this is how we discover new things) and it’s fun 
to do this with the result we’ve just proved. Let’s take m = 2. The theorem tells
us that there are three successive numbers which are not prime and it even tells
us where to find these - they are 4! + 2, 4! + 3, 4! + 4 which are 26, 27, 28. But
by searching directly we can find much smaller numbers than this - indeed none
of 8, 9 and 10 are prime. Theorem 1.2.3 tells us that a certain list of m + 1 numbers
exists but it doesn’t give any information about the smallest number where such 
a list might begin. As far as I know, there is no known answer to that question. 
One of the reasons why prime numbers are so interesting is that it is easy to 
state unsolved problems. For example Goldbach’s conjecture which dates back 
to 1742 remains unsolved. It says that every even number greater than 4 is the 
sum of two (odd) prime numbers. This is easy to verify for small numbers e.g. 
6 = 3 + 3, 8 = 3 + 5, 10 = 3 + 7 etc. but a general proof evades us so far. 5 By the
way, the corresponding conjecture is false for odd numbers and you’re invited to 
find a counter-example, i.e. an odd number which cannot be written as the sum 
of two prime numbers. 
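Checking Goldbach’s conjecture for small even numbers is a pleasant exercise on a computer. The Python sketch below is an illustration only: the names `is_prime` and `goldbach_pair` are hypothetical, and a finite check of this kind is evidence, never a proof. It looks for one pair of odd primes adding up to a given even number.

```python
def is_prime(n):
    """True when n is greater than 1 and has no factor other than 1 and n."""
    if n < 2:
        return False
    a = 2
    while a * a <= n:
        if n % a == 0:
            return False
        a += 1
    return True


def goldbach_pair(n):
    """Return one pair of odd primes (p, q) with p + q = n, or None."""
    for p in range(3, n // 2 + 1, 2):
        if is_prime(p) and is_prime(n - p):
            return p, n - p
    return None


for n in (6, 8, 10, 100):
    print(n, goldbach_pair(n))
```

The same function can be pointed at odd numbers if you would like help hunting for the counter-example mentioned above.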

I’ve already commented on the fact that there is no known formula for 
generating all of the prime numbers. Mathematicians are fascinated with the 
patterns within prime numbers and the study of this comes within that part of 
the subject called number theory. On the other hand searching for very large 
prime numbers (and so discovering new ones) doesn’t really involve much ‘real’ 
mathematics but does require a vast amount of computer power. The largest 
prime discovered so far (at the time of writing) was found in September 2008 and 
can be written as $2^{43112609} - 1$. 6 One of the quantities that mathematicians have


5 See e.g. http://en.wikipedia.org/wiki/Goldbach's_conjecture for more background. 

6 See http://primes.utm.edu/largest.html 
been particularly interested in is the ‘function’ π(n) which is defined to be the
number of primes less than or equal to the natural number n. 7 So for example,
π(2) = 1, π(3) = 2, π(4) = 2, π(5) = 3, π(10) = 4, π(100) = 25, π(1000) =
168, $\pi(10^9) = 50847534$. 8 I emphasise that there is no exact formula that tells
us what π(n) might be for arbitrarily large n. However there is an approximate
formula. In 1896, the French mathematician Jacques Hadamard (1865-1963)
and the Belgian mathematician Charles de la Vallée-Poussin (1866-1962) independently published their
proofs of what has now become known as the ‘prime number theorem’. This
tells us that as n becomes very large there is a sense in which π(n) gets closer and
closer to (but never reaches) the quantity $\frac{n}{\log_e(n)}$. Here $\log_e(n)$ is the logarithm to
base e of the number n. 9 If you haven’t met it before, you’ll be able to find a lot of
information about the number e in Chapter 7. Now let’s focus on the phrase ‘gets
closer and closer to (but never reaches)’. My calculator tells me that $\frac{10}{\log_e(10)} = 4.34$
which isn’t so far from the exact value of 4, $\frac{100}{\log_e(100)} = 21.71$ which is reasonably
close to the precise value of 25, $\frac{1000}{\log_e(1000)} = 144.76$ which is in the right ball-park as
168 but $\frac{10^9}{\log_e(10^9)} = 48254942$ which isn’t terribly close to 50847534. This doesn’t
look at all convincing but the evidence we have presented here is sparse - just a
handful of numbers. Also I haven’t yet told you the right way to compare π(n) and
$\frac{n}{\log_e(n)}$ - but maybe you can guess this? We won’t prove the prime number theorem
in this book as it uses far more advanced mathematics than we can hope to cover 
here, but one of the main themes we will present is the notion of a limit which 
gives a very precise meaning to this mysterious phrase ‘gets closer and closer to 
(but never reaches)’. This should at least enable you to reach some understanding 
of what the prime number theorem is really telling us. 
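
The handful of values above can be extended by machine. The sketch below, in Python, counts primes with the sieve of Eratosthenes (a standard method, not one described in the text) and prints $\pi(n)$ next to $n/\log_e(n)$ for powers of ten up to a million. The function name `prime_count_table` is an illustrative choice, and the last printed column, the ratio of the two quantities, is just one comparison worth experimenting with.

```python
import math


def prime_count_table(limit: int) -> list[int]:
    """Return a list pi where pi[n] is the number of primes <= n (sieve of Eratosthenes)."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = False
    if limit >= 1:
        is_prime[1] = False
    for d in range(2, math.isqrt(limit) + 1):
        if is_prime[d]:
            # Cross out every multiple of d, starting from d*d.
            for multiple in range(d * d, limit + 1, d):
                is_prime[multiple] = False
    pi, running = [], 0
    for flag in is_prime:
        running += flag  # True counts as 1
        pi.append(running)
    return pi


pi = prime_count_table(10**6)
for k in range(1, 7):
    n = 10**k
    estimate = n / math.log(n)  # math.log is the logarithm to base e
    print(n, pi[n], round(estimate, 2), round(pi[n] / estimate, 4))
```

Watching how the final column behaves as n grows is a useful warm-up for the idea of a limit.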


1.3 The Integers 


The natural numbers are wonderful things but they have a number of limitations. 
One of these is that they only enable us to count in one direction. If for example 
a company wants to balance their assets against their debts then we really need 
to give the numbers a direction so that the positive benefits of the assets can 
be weighed against the negative impact of the debts. We do this by introducing 


7 The use of the Greek letter $\pi$ here has absolutely nothing to do with the universal constant
of that name which arises as the ratio of the circumference of any circle to its diameter.
Mathematics uses so many concepts that it's commonplace to employ the same symbol in
different contexts where there is no fear of ambiguity leading to misunderstanding.

8 See http://en.wikipedia.org/wiki/Prime_number_theorem 

9 If $a = b^x$ we say that x is the logarithm to base b of the number a and write $x = \log_b(a)$,
e.g. $100 = 10^2$ and so $2 = \log_{10}(100)$.

-4 -3 -2 -1 0 1 2 3 4 5 6 7

Figure 1.1. Integers as distances on a line. 


another copy of the natural numbers and putting a minus sign in front of these. 
Thus we create the negative numbers $-1, -2, -3, -4, -5, \ldots$. We obtain the
integers when these are combined together with the natural numbers. There is a 
neat way to see this visually as shown in Figure 1.1. Just draw a straight line on a 
piece of paper and mark the natural numbers in ascending order on the right so 
that there is a fixed distance between each successive number. We similarly mark 
the negative numbers in descending order on the left. 

Observe that a new number has entered the arena which is neither positive 
nor negative. This is of course zero - denoted 0. If we think of the line as 
marking steps on a journey then 0 is the starting point, the natural numbers 
mark steps to the right away from zero and the negative numbers are steps to 
the left away from zero. The integers are then the numbers which describe all of 
these distances from zero on the line (in both directions and including standing 
still) and if we try to write them in increasing order we do something like this: 

\[ \ldots, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, \ldots \]

which indicates that these are infinite
in both directions. Indeed we now have a geometric way of thinking about this 
twofold infinity in terms of our line which extends indefinitely in both directions - 
to the right and to the left. A little bit of (seemingly nit-picking) terminology will be 
useful for us later on. Natural numbers are sometimes also called positive integers 
and the natural numbers together with zero are called nonnegative integers. 

The description that we have given of the integers so far is ‘static’. We also need 
to be able to think of them ‘dynamically’, i.e. in a way that allows them to interact 
through arithmetic. Let’s start by going back to Figure 1.1. Start at zero and take 
m steps to the right and then n steps to the left. We are then at the point $m - n$.
If m is bigger than n we will be on the right and so have arrived at a natural
number while if m is smaller than n we have reached a negative number. If m and
n are equal then we are back at zero. Addition of integers corresponds precisely to
combining steps in this way. So for example $7 + (-3) = 4$ is the result of taking 7
steps to the right and then 3 to the left. You can see similarly that $-7 + 3 = -4$.
We have also seen that for any natural number n, $-n + n = 0 = n + (-n)$.

Children often have difficulty in understanding the multiplication of integers. 
Now each non-zero number has a signature - it is either positive or negative and 
the rules for multiplication can be summed up in the following table: 

positive × positive = positive,

positive × negative = negative,

negative × positive = negative,

negative × negative = positive,

so for example $-7 \times 4 = -28$ but $(-7) \times (-4) = 28$.¹⁰ It’s the last of these
that often causes the most confusion. Why do two minuses make a plus? Here are 
two different attempts to explain this. One approach is algebraic and the other is 
dynamic and visual. 

1. Algebraic. We have already seen that

\[ n + (-n) = 0. \]

Now multiply both sides of this equation by $-1$ and you should agree that
we get

\[ -n + (-(-n)) = 0. \]

So we have that

\[ n + (-n) = -n + (-(-n)). \]

Now add n to both sides (or cancel $-n$ from both sides) to conclude (after
some rearrangement) that

\[ n + (-n + n) = (-(-n)) + (-n + n). \]

But we now have

\[ n + 0 = (-(-n)) + 0, \]

in other words $n = -(-n)$.¹¹

2. Dynamic. Let’s create the negative numbers from the natural ones. How shall 
we do this? We start with the number line as in Figure 1.1 but this time the 
negative numbers are missing as in Figure 1.2. 

Now think of the line as though it were sitting inside of a two-dimensional
infinite plane.¹² Draw a line that is perpendicular to our number line and
which passes through zero. This is demonstrated in Figure 1.3.

Think of the perpendicular line as a mirror that creates an image of each
natural number on the other side of the line. Then $-1$ is the mirror image
of $+1$, $-2$ is the mirror image of $+2$ etc. The number $-1$ is very special in
this context because it produces each reflected image by multiplication, i.e.
$-1 \times 1 = -1$, $-1 \times 2 = -2$, $-1 \times 3 = -3$ etc. Now having gone from
right to left, let’s go back to the other side of the mirror by taking the mirror
image again. To go from left to right we must multiply by $-1$ again so
$-1 \times -1 = 1$, $-1 \times -2 = 2$, $-1 \times -3 = 3$ etc.


10 The purpose of the brackets is to make the display easier to read. 

11 Some of you will notice our use of the commutative and associative laws of addition.
I'm not going to make a fuss about these in this book. However they (and other algebraic 
properties of numbers) are listed in Appendix 4 for convenient reference. 

12 Mathematicians call this an 'embedding'. 
0 1 2 3 4 5 6 7

Figure 1.2. Natural numbers as distances along a line.

Figure 1.3. Obtaining the integers by reflection.


In this chapter we have worked entirely with whole numbers. Now we must split 
these apart and come to an understanding of how every point on the line in 
Figure 1.1 can represent a number. That is the task of the next chapter. 


1.4 Exercises for Chapter 1


The first six questions in this section are designed to help you practise simple proof 
techniques and the remainder focus on prime numbers. 


1. Prove that if n is an odd number then $n^2$ and $n^3$ are also odd. [Hint: Use the fact
that if n is odd then $n = 2m - 1$ for $m = 1, 2, 3, \ldots$]

2. Prove that if $n^2$ is an even number then n is also even. [Hint: Assume that n is
odd and derive a contradiction.]

3. Prove that if m is odd and n is even then m + n is also odd. If m is larger than n,
is $m - n$ odd or even? Justify your answer with a proof.

4. Prove that the sum of four consecutive natural numbers is always even. 
5. (a) Deduce that every odd number is either of the form $4m - 1$ or $4m + 1$
where m is a natural number. Numbers of the form $4m + 3$ are clearly also
odd. How do you reconcile this with the previous assertion?

(b) Show that the product of two odd numbers of the form $4m + 1$ is also of
this form.

(c) Show that the product of an odd number of the form $4m + 1$ and an odd
number of the form $4m - 1$ is itself of the form $4m - 1$.

6. Show that any odd number that takes the form $3m + 1$ must also be of the
form $6n + 1$, where m and n are natural numbers.

7. (a) It is rumoured that the numbers $2^p - 1$ where p is a prime number are
always prime. Start making a list of these numbers and find the smallest p
for which the rumour fails. The numbers of the form $2^p - 1$ which really
are prime are called Mersenne numbers after a conjecture made by Marin
Mersenne in 1644. How many of these can you find?
(b) Fermat numbers (named after Pierre de Fermat) are those that are of the
form $2^{2^n} + 1$ where n is a nonnegative integer ($n = 0, 1, 2, \ldots$). Deduce
that the first five Fermat numbers are prime but that the sixth is composite.
(Hint: Seek a factor between 640 and 645.)

8. Substitute $n = 1, 2, 3, \ldots$ into $n^2 - n + 41$. How many of the numbers that you
obtain in this way are prime? Do you think that this formula always generates
prime numbers? If not give a counter-example.

9. (a) Show that the proof of Theorem 1.2.2 still works if the number p! is replaced
by $p\#$ where $p\#$ is the product of all the prime numbers up to and including p
($p\# = 2 \cdot 3 \cdot 5 \cdot 7 \cdots p$).

(b) Calculate the first six cases of $p\# + 1$. Show that the first five of these are
all prime but that the sixth is composite. [Hint: Seek a factor between 55
and 65.]

10. Imitate the proof of Theorem 1.2.2 to show that there are infinitely many prime
numbers of the form $4n - 1$. [Hint: Assume that there is a largest prime number p
that takes this form and consider the number $M = 4(3 \cdot 7 \cdot 11 \cdots p) - 1$. Consider
possible prime factors of the form $4m - 1$ and $4m + 1$ separately and use the
results of Question 5.]¹³


13 There are also infinitely many prime numbers of the form $4n + 1$, but the proof of this
fact is too difficult to include here.
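
For readers who would like to experiment with Questions 7 to 9 on a computer, here is one possible starting point in Python. It is only a sketch: the names `is_prime`, `smallest_factor`, `mersenne_candidates`, `fermat`, `quadratic` and `primorial_plus_one` are illustrative, and trial division is used simply because the numbers involved are small. Running it prints the Fermat numbers and the numbers from Question 9(b) together with their smallest factors, so try the questions by hand first.

```python
def is_prime(n: int) -> bool:
    """Trial division; fine for numbers of the size met in these exercises."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def smallest_factor(n: int) -> int:
    """Smallest factor of n greater than 1 (n itself when n is prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


primes = [p for p in range(2, 40) if is_prime(p)]

# Question 7(a): the numbers 2^p - 1 for the first eight primes p.
mersenne_candidates = {p: 2**p - 1 for p in primes[:8]}

# Question 7(b): Fermat numbers 2^(2^n) + 1 for n = 0, ..., 5.
fermat = [2 ** (2**n) + 1 for n in range(6)]

# Question 8: values of n^2 - n + 41.
quadratic = {n: n * n - n + 41 for n in range(1, 45)}

# Question 9(b): product of the first k primes, plus one, for k = 1, ..., 6.
primorial_plus_one = []
product = 1
for p in primes[:6]:
    product *= p
    primorial_plus_one.append(product + 1)

for label, values in [("Fermat", fermat), ("product of primes + 1", primorial_plus_one)]:
    print(label, [(v, smallest_factor(v)) for v in values])
```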
Let’s Get Real


Historically fractions owe their creation to the transition from counting
to measuring.

Philosophy of Mathematics and Natural Science, Hermann Weyl


2.1 The Rational Numbers 


A well-known nineteenth century mathematician named Leopold Kronecker 
(1823-1891) made a since oft-quoted remark at a meeting in Berlin in 1886 
to the effect that ‘The integers were made by God, everything else is the work of 
man’.¹ Whether or not we agree with this point of view, the needs of mathematics,
science and technology require us to gain some insight into ‘everything else’. We 
begin the (metaphorical) descent from heaven to earth by introducing the rational 
numbers. These are all numbers that can be written in the form $\frac{a}{b}$, where a is an
integer and b is a natural number. So they comprise all (positive and negative)
‘proper fractions’ (whose numerator is smaller in size than the denominator) as well as
‘improper fractions’ (where it is not). We always write rational numbers in their lowest terms, so
fractions that cancel down to the same value are all identified with one another.
Integers are included in the rational numbers - indeed
we recognise the rational number $\frac{n}{1}$ as the integer n.

The name ‘rational numbers’ suggests that these are numbers that appeal to 
reason in some sense. This may well be the case but the terminology actually 
refers to the fact that such numbers express ratios, e.g. $\frac{3}{7}$ may express the division
of a plot of land or a cake into 7 equal parts of which you or I may be entitled 
to precisely 3. We can do arithmetic with rational numbers and I hope that you 
remember how to add fractions by finding the lowest common multiple of the 


1 'Die ganzen Zahlen hat der liebe Gott gemacht, alles andere ist Menschenwerk'.
denominators so e.g.

\[ \frac{1}{2} + \frac{1}{5} = \frac{5}{10} + \frac{2}{10} = \frac{7}{10}. \]

The general rule for addition of fractions is

\[ \frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}, \]

but when you use this formula you should always cancel the right-hand side down
to its lowest terms.

Similarly the general formula for multiplying fractions is

\[ \frac{a}{b} \times \frac{c}{d} = \frac{ac}{bd}. \]
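
The two rules, together with the cancelling step, can be written out as a short Python sketch. The function names `add_fractions` and `multiply_fractions` are illustrative and each fraction is passed as a numerator and a (positive) denominator. Python's standard `fractions` module does the same job and is used at the end as a cross-check.

```python
from fractions import Fraction
from math import gcd


def add_fractions(a: int, b: int, c: int, d: int) -> tuple[int, int]:
    """Return a/b + c/d in lowest terms, using (ad + bc)/bd and then cancelling."""
    numerator, denominator = a * d + b * c, b * d
    common = gcd(numerator, denominator)
    return numerator // common, denominator // common


def multiply_fractions(a: int, b: int, c: int, d: int) -> tuple[int, int]:
    """Return (a/b) * (c/d) in lowest terms, using ac/bd and then cancelling."""
    numerator, denominator = a * c, b * d
    common = gcd(numerator, denominator)
    return numerator // common, denominator // common


print(add_fractions(1, 2, 1, 5))         # one half plus one fifth
print(add_fractions(1, 6, 1, 3))         # the rule gives 9/18 before cancelling
print(Fraction(1, 2) + Fraction(1, 5))   # the standard library as a cross-check
```

Cancelling by the greatest common divisor is what guarantees that the answer comes out in its lowest terms.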

All rational numbers may be expressed in decimal form (or by a decimal
expansion) so for example $\frac{1}{2} = 0.5$, $\frac{1}{4} = 0.25$, $\frac{32}{5} = 6.4$. The three numbers that
I’ve written so far have finite decimal expansions but this is not always the case.
Even the simple fraction $\frac{1}{3}$ can lead us to a contemplation of the infinite for
its decimal expansion is $\frac{1}{3} = 0.3333333\cdots$ where the dots indicate that there
is no end to the process of writing 3s. This is sometimes written as $0.\dot{3}$ and
called a recurring decimal. What does this infinite train of 3s actually mean? Well
$0.3 = \frac{3}{10}$, $0.33 = \frac{3}{10} + \frac{3}{100}$ while $0.333 = \frac{3}{10} + \frac{3}{100} + \frac{3}{1000}$ so the meaning of $0.\dot{3}$
appears to be an infinite sum of fractions

\[ 0.\dot{3} = \frac{3}{10} + \frac{3}{100} + \frac{3}{1000} + \frac{3}{10^4} + \cdots \]

One of the tasks of later chapters will be to make sense of infinite sums of this type.

If I ask my pocket calculator to give me the value of $\frac{1}{3}$ it delivers the answer
0.3333333. This is a lie. Of course calculators are designed to help us make
practical calculations and not to explore fundamental mathematical truths, so the
approximation of the precise expression $\frac{1}{3}$ by the first eight digits of its decimal
expansion 0.3333333 is probably going to be enough for most everyday purposes.
There are many other fractions that fail to have a finite decimal expansion,
e.g. $\frac{3}{11} = 0.27272727\cdots$ which can be written more succinctly as $0.\dot{2}\dot{7}$.

An interesting fact about many fractions that have infinite decimal expansions 
is that they are periodic in the sense that there is a pattern that endlessly repeats 
itself. This is obvious in the case of $\frac{1}{3}$ or $\frac{3}{11}$. It is much less so if I ask my pocket
calculator to find $\frac{1}{7}$. It tells me that the answer is 0.1428571 and there isn’t any
particular evidence of a pattern here, however if you continue to expand (using
more powerful software or good old-fashioned long division) you obtain

\[ \frac{1}{7} = 0.142857142857142857\cdots \]

so the pattern here is an endless repetition of the digits 142857, first as
$\frac{1}{10} + \frac{4}{100} + \frac{2}{1000} + \frac{8}{10^4} + \frac{5}{10^5} + \frac{7}{10^6}$, then as
$\frac{1}{10^7} + \frac{4}{10^8} + \frac{2}{10^9} + \frac{8}{10^{10}} + \frac{5}{10^{11}} + \frac{7}{10^{12}}$ and
so on, indefinitely. The ‘dot notation’ that was mentioned above for succinctly
representing $\frac{1}{3}$ and $\frac{3}{11}$ in decimal form can also be extended to cases like this by
placing a dot above both the first and last integers which appear in the block that
is repeated, so $\frac{1}{7} = 0.\dot{1}4285\dot{7}$.

In fact it is possible to prove a theorem (although we won’t do so here) to the 
effect that every fraction has a decimal expansion that is either 

1. finite, e.g. $\frac{1}{4} = 0.25$,

2. periodic, as described above,

3. eventually periodic, i.e. periodic behaviour doesn’t start immediately but it
must do eventually (after a finite number of non-periodic numbers have
appeared), e.g. $\frac{1}{6} = 0.166666\cdots = 0.1\dot{6}$.
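
The three cases can be seen at work by doing long division on a computer. The Python sketch below is not a proof and the name `decimal_expansion` is an illustrative choice. It carries out the division one digit at a time and remembers every remainder it has met. When dividing by b the remainder can only be one of 0, 1, ..., b - 1, so either it reaches 0 (a finite expansion) or an earlier remainder comes round again, and from that point the digits repeat.

```python
def decimal_expansion(a: int, b: int) -> tuple[int, str, str]:
    """Long division of a/b (a >= 0, b > 0).

    Returns (integer part, non-repeating digits, repeating block).
    An empty repeating block means the expansion is finite.
    """
    integer_part, remainder = divmod(a, b)
    digits = []
    seen = {}  # remainder -> position at which it first appeared
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, b)
        digits.append(str(digit))
    if remainder == 0:
        return integer_part, "".join(digits), ""
    start = seen[remainder]  # the block from here onwards repeats for ever
    return integer_part, "".join(digits[:start]), "".join(digits[start:])


for a, b in [(1, 4), (1, 3), (1, 7), (1, 6)]:
    print(f"{a}/{b}:", decimal_expansion(a, b))
```

The second entry of each result holds the digits that appear before any repetition starts, and the third holds the block that repeats.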

Kronecker believed that rational numbers were ‘less pure’ than the integers,
but we seem to commit an act of violence when we try to represent a fraction
such as $\frac{1}{3}$ by an infinite decimal which will never have enough finite terms to
capture the true meaning. There is another sense in which a decimal expansion
is unsatisfactory and that is because it involves a ‘choice of base’. When we write
$\frac{1}{3}$ as a decimal, each successive term is obtained by taking a higher power of 10
on the denominator. But why should we privilege the number 10 in this way? In
fact it is a convention that comes from the fact that standard human beings have
ten fingers (or toes). If we used the number 2 as a base (as is common in binary
arithmetic) we would write $\frac{1}{8} = 0.001$ because

\[ \frac{1}{8} = \frac{0}{2} + \frac{0}{2^2} + \frac{1}{2^3}; \]

indeed we will see later on in Section 6.11 that in this case the number $\frac{1}{3}$ has the
recurring binary expansion $0.0101010\cdots = 0.\dot{0}\dot{1}$. But if we choose 3 as a base then
we can avoid the infinite in dealing with $\frac{1}{3}$ as it has a finite ‘trinary’ expansion 0.1.
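
The digits of a fraction in any base can be produced by the same recipe: multiply by the base, peel off the whole-number part as the next digit, and repeat with what is left. The Python sketch below assumes a fraction between 0 and 1, and the name `digits_in_base` is an illustrative choice. Exact arithmetic with `Fraction` avoids the rounding that a calculator would introduce.

```python
from fractions import Fraction


def digits_in_base(x: Fraction, base: int, count: int) -> list[int]:
    """First `count` digits after the point of x (0 <= x < 1) written in `base`."""
    digits = []
    for _ in range(count):
        x *= base
        digit = x.numerator // x.denominator  # whole-number part is the next digit
        digits.append(digit)
        x -= digit
    return digits


print(digits_in_base(Fraction(1, 8), 2, 6))    # one eighth in base 2
print(digits_in_base(Fraction(1, 3), 2, 8))    # one third in base 2
print(digits_in_base(Fraction(1, 3), 3, 4))    # one third in base 3
print(digits_in_base(Fraction(1, 3), 10, 8))   # one third in base 10
```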

Now let us return to the number line on which we represented the integers in 
the last chapter as steps away from zero. The rational numbers give us much more 
flexibility as the diagram in Figure 2.1 shows. 

Indeed it appears to the eye that the rationals are filling up all the spaces on 
the line. For example, let’s try to get as close to zero as we can. We can take a very 


Figure 2.1. Some rational numbers on the number line.

Figure 2.2. Magnified picture of the number line near zero, marked at $\frac{1}{10^{18}}, \frac{2}{10^{18}}, \ldots, \frac{9}{10^{18}}, \frac{1}{10^{17}}$.


small number indeed such as $\frac{1}{10^{17}}$. This seems imperceptibly close to zero and yet
we can easily find nine smaller numbers $\frac{1}{10^{18}}, \frac{2}{10^{18}}, \ldots, \frac{9}{10^{18}}$ which are all smaller
than $\frac{1}{10^{17}}$ as shown in Figure 2.2. But why stop at 17 and 18? By choosing larger
and larger numbers in the denominator, we get smaller and smaller fractions that 
are surely filling up all the space on the line close to zero, aren’t they? I often used 
to ask first year undergraduate mathematics students to vote on this question. 
About half of them usually agreed that the rational numbers really do fill up the 
whole number line. What do you think? 


2.2 Irrational Numbers 


To answer the question that we introduced at the end of the last section, we first 
need to focus on square roots. I’ll remind you that a nonnegative rational number
b has a square root a if $a^2 = b$. Let’s list some numbers that have square roots.
The square root of 0 is 0 as $0^2 = 0$. The number 1 has two square roots 1 and
$-1$ as $1^2 = 1$ and $(-1)^2 = 1$. The number 4 also has two square roots 2 and $-2$.
In fact if b is positive and it has a square root a that is positive then $-a$ is also
a square root since $(-a)^2 = a^2 = b$. We always write the positive square root as
$a = \sqrt{b}$ and so $-a = -\sqrt{b}$. So we’ve seen that $\sqrt{1} = 1$, $\sqrt{4} = 2$ and similarly
$\sqrt{9} = 3$, $\sqrt{16} = 4$, $\sqrt{25} = 5$ etc. Some fractions also have easily obtainable square
roots. Now the numbers that we have looked at so far are all square
roots of perfect squares (or fractions built from these) - in other words we have
just identified the square root by noticing that the number we are square-rooting is
$1^2, 2^2, 3^2, 4^2, \ldots$. But what about $\sqrt{2}, \sqrt{3}, \sqrt{5}, \sqrt{6}, \sqrt{7}$ etc.? Do these expressions
have a meaning? We are now encountering what is sometimes referred to as the
‘great crisis of Greek mathematics’. One of the great triumphs of antiquity is
Pythagoras’ theorem which states that in a right-angled triangle the length of the
longest side c is related to that of the other two sides a and b by the beautiful
formula:

\[ c^2 = a^2 + b^2, \qquad (2.2.1) \]

so if, for example $a = 3$ and $b = 4$ then we must have $c = 5$. Now suppose that I
put $a = 1$ and $b = 1$, then $c^2 = 2$ and I have constructed a right-angled triangle
whose longest side $c = \sqrt{2}$ as is shown in Figure 2.3 below.
Figure 2.3. Right-angled triangle with sides of length 1, 1 and $\sqrt{2}$.

Figure 2.4. Irrational numbers on the number line.


Indeed if I look up $\sqrt{2}$ on my pocket calculator I get ‘$\sqrt{2} = 1.4142136$’² so
I even know the first eight terms in the decimal expansion of this number. Now
I can construct another right-angled triangle with smaller sides $a = \sqrt{2}$ and $b = 1$
to find that $c = \sqrt{3}$ and my calculator reports that ‘$\sqrt{3} = 1.7320508$’. Next I find
$\sqrt{5}$ from a right-angled triangle whose shortest sides are of length 1 and 2. Since
positive square roots are unique it follows that if three numbers x, y and z are
related by $x = yz$, then $\sqrt{x} = \sqrt{y}\sqrt{z}$ and so we do not even need to draw a triangle
to obtain $\sqrt{6}$ (although we can if we want to) - it must be $\sqrt{2} \times \sqrt{3}$. Now we
have identified $\sqrt{2}$, $\sqrt{3}$, etc. as lengths of the longest side of a right-angled
triangle and we can even determine these to a high level of accuracy using our
pocket calculators,³ so we can incorporate these into the real number line as in
Figure 2.4. Greek mathematicians knew about these things but they also knew the 
result of the next theorem and it is this which is the heart of the crisis alluded to 
earlier. 

Theorem 2.2.1. If p is a prime number then $\sqrt{p}$ is not a rational number.


2 The reason for using quotes is that (as we have already seen) calculators can't be trusted 
to tell the whole story when they deal with decimal expansions. 

3 Of course this presupposes the mathematics necessary to do these calculations. 
Proof. Let us assume that $\sqrt{p}$ is rational and seek a contradiction. So we write
$\sqrt{p} = \frac{b}{a}$ and then square both sides to get $p = \frac{b^2}{a^2}$ so that $b^2 = a^2 p$.⁴

Now we recall Theorem 1.2.1, that every natural number has a prime
decomposition. We write a in terms of its prime decomposition as
$a = 2^{m_1} 3^{m_2} 5^{m_3} \cdots q^{m_N}$ where q is the largest prime number that we need and N tells
us how many prime numbers we have to count in order to get to q.⁵ If we square
this we get

\[ a^2 = 2^{2m_1} 3^{2m_2} 5^{2m_3} \cdots q^{2m_N}. \]

Now we do the same for b. We write its prime factorisation as $b = 2^{n_1} 3^{n_2} 5^{n_3} \cdots r^{n_M}$
and square this to get

\[ b^2 = 2^{2n_1} 3^{2n_2} 5^{2n_3} \cdots r^{2n_M}. \]

Now let’s return to the equation $b^2 = a^2 p$ and substitute in our prime
factorisations for $a^2$ and $b^2$. We obtain

\[ 2^{2n_1} 3^{2n_2} 5^{2n_3} \cdots r^{2n_M} = 2^{2m_1} 3^{2m_2} 5^{2m_3} \cdots q^{2m_N} p. \]

Now if the number p doesn’t appear on the left-hand side we already have a
contradiction so let’s assume that it does and that it is one of the numbers
2, 3, 5, . . . , r. Each of these prime numbers appears an even number of times
on the left-hand side. Now on the right-hand side either p is not one of the
numbers 2, 3, . . . , q in which case it only appears once altogether, or it is in that
list of numbers in which case the extra multiplication by p means that it appears
an odd number of times. Either way we have a contradiction and so we conclude
that $\sqrt{p}$ cannot be a rational number. □
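
The heart of the proof is a count of how many times p appears on each side of $b^2 = a^2 p$. The Python sketch below illustrates that count for one prime and a range of values of a and b. The name `exponent_of` and the choice p = 7 are illustrative, and checking finitely many cases is of course no substitute for the argument above.

```python
def exponent_of(p: int, n: int) -> int:
    """Number of times the prime p appears in the prime decomposition of n."""
    count = 0
    while n % p == 0:
        n //= p
        count += 1
    return count


p = 7  # any prime would do
for a in range(1, 200):
    for b in range(1, 200):
        left = exponent_of(p, b * b)        # p appears an even number of times
        right = exponent_of(p, a * a * p)   # p appears an odd number of times
        assert left % 2 == 0 and right % 2 == 1
        assert b * b != a * a * p           # so the two sides can never agree

print("no solutions of b*b == a*a*p found for p =", p)
```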


The conventional story is that Greek mathematicians BCE⁶ were not prepared
to accept that quantities that could not be expressed as rational numbers could 
be legitimate numbers. It wasn’t until the Renaissance that mathematicians began 
to feel comfortable with these numbers and this is believed to have held back the 
development of science and mathematics for several centuries. Nowadays we call 
numbers like $\sqrt{p}$ irrational numbers. This is not because we believe their existence
to be an affront to reason - the word ‘irrational’ here should be interpreted as ‘not 
a ratio of integers’. 

In mathematics we use the term ‘corollary’ to label a new result which follows 
fairly easily from a theorem that has just been proved without us having to do 
very much extra work. The first corollary in this book is one that we obtain from 
Theorem 2.2.1: 


4 Of course you can assume that the fraction is written in its lowest terms but this isn’t
necessary for the proof to work.

5 If for example, 3 doesn't appear in the prime factorisation of a, the notation we've used
still makes good sense as we then have $m_2 = 0$ and so $3^{m_2} = 3^0 = 1$.

6 BCE is a secular term meaning “before the common era” and is used as an alternative to
BC (“before Christ”) - see e.g. http://en.wikipedia.org/wiki/Common_Era

Corollary 2.2.1. There are an infinite number of irrational numbers.

Proof. We saw in Theorem 1.2.2 that there are an infinite number of prime
numbers while Theorem 2.2.1 tells us that each of these prime numbers has an
irrational square root. It follows that the list $\sqrt{2}, \sqrt{3}, \sqrt{5}, \sqrt{7}, \ldots$ has no largest
member and that’s what we set out to prove. □

We have shown that there are already infinitely many irrational numbers that
take the form $\sqrt{p}$ where p is a prime number. In fact there are many more
irrational numbers than this and later on, in Chapter 7, we will look at proving
the irrationality of some very famous numbers such as e - the base of natural
logarithms and $\pi$ - the ratio of the circumference of any circle to its diameter. But
for now let’s stay with square roots for just a little longer. We’ve seen that all prime 
numbers have irrational square roots and we know that the perfect squares have 
square roots that are integers. What about the square roots of other composite 
numbers such as 6, 8, 10, 12, 14, 15 etc.? The answer to this question can be found 
in the following theorem: 

Theorem 2.2.2. If N is a natural number then either it is a perfect square or $\sqrt{N}$
is irrational.

Proof. Suppose that N is not a perfect square and suppose that $\sqrt{N}$ is a rational
number. We’ll try to obtain a contradiction. First we write $\sqrt{N}$ as a (nonnegative)
integer plus a fraction in its lowest terms i.e.

\[ \sqrt{N} = a + \frac{b}{c}, \qquad (2.2.2) \]

where a, b and c are natural numbers. We also know that b is a smaller number
than c with the fraction being written in its lowest terms. Now multiply both
sides of the expression labelled (2.2.2) by c and square both sides to get

\[ c^2 N = (ca + b)^2 = c^2 a^2 + 2cab + b^2. \]

Using some simple algebraic manipulations we see that

\[ b^2 = c^2 N - c^2 a^2 - 2cab = c(cN - ca^2 - 2ab). \]

This tells us that c is a factor of $b^2$ so that $b^2 = cd$ where $d = cN - ca^2 - 2ab$ is
a natural number. So $c = \frac{b^2}{d}$. If you substitute for c in (2.2.2) you can check that
you get

\[ \sqrt{N} = a + \frac{d}{b}. \qquad (2.2.3) \]

Now we’ve already pointed out that b is smaller than c. We also have that d is
smaller than b for otherwise cd would be the product of two numbers both greater
than b and this must give a larger number than $b^2$.⁷ That would contradict the fact
that $b^2 = cd$. This means that $\frac{d}{b}$ is a fraction written in lower terms than $\frac{b}{c}$ and
that gives the contradiction we need to prove the theorem. □
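
The theorem can also be explored by brute force. The Python sketch below looks for a fraction $\frac{k}{c}$ whose square is exactly N, trying every denominator up to a chosen bound, and confirms that one turns up only when N is a perfect square. The names `is_perfect_square` and `rational_square_root` and the bounds 50 and 500 are illustrative choices, and a finite search like this is evidence rather than proof.

```python
from fractions import Fraction
from math import isqrt
from typing import Optional


def is_perfect_square(n: int) -> bool:
    """True if n = k*k for some integer k."""
    root = isqrt(n)
    return root * root == n


def rational_square_root(n: int, max_denominator: int) -> Optional[Fraction]:
    """Search for a fraction r with denominator <= max_denominator and r*r == n."""
    for c in range(1, max_denominator + 1):
        # r = k/c with r*r == n forces k*k == n*c*c.
        target = n * c * c
        k = isqrt(target)
        if k * k == target:
            return Fraction(k, c)
    return None


for n in range(1, 50):
    found = rational_square_root(n, 500)
    assert (found is not None) == is_perfect_square(n)

print(rational_square_root(9, 500), rational_square_root(6, 500))
```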

So Theorem 2.2.2 tells us that $\sqrt{6}$, $\sqrt{8}$, $\sqrt{10}$ etc. are all irrational numbers. By
using some simple algebra we can construct many more irrational numbers using
those that we already have. I could present the following in standard ‘theorem-proof’
mode but let’s have a change and use a list. (The label I here stands for
‘irrational’.)

Algebra of Irrational Numbers 

I(i) If x is an irrational number and a is rational and non-zero then ax is 
irrational. 

To see that this is true suppose that ax is rational and look for a 
contradiction in the usual way. So we write $ax = \frac{q}{p}$ where q is an integer
and p is a natural number. But a is rational so $a = \frac{c}{b}$. This tells us that

\[ \frac{c}{b} x = \frac{q}{p}, \]

i.e. $x = \frac{bq}{cp}$ which is rational and that’s the contradiction we were
looking for.

So we now see that numbers like $12\sqrt{3}$ are irrational.

I(ii) If x is irrational and a is rational then x + a is irrational.

This is more-or-less the same argument as the one we just used so we
suppose that x + a is rational and write $x + a = \frac{e}{f}$. But $a = \frac{b}{c}$ is rational
and so

\[ x = \frac{e}{f} - \frac{b}{c} = \frac{ec - bf}{cf} \]

which is a rational number and that’s the contradiction we wanted.

This result together with that of Theorem 2.2.2 tells us that numbers
like $100 + \sqrt{99}$ are irrational. If we also combine it together with that
of I(i) we see that the number $\frac{1}{2} + \frac{\sqrt{5}}{2}$ is also irrational. This is a very
famous number - it was called the golden ratio (or golden section) by the
Ancient Greeks. We will have more to say about it later on in the book.⁸

I(iii) If x is a positive irrational number then $\sqrt{x}$ is also irrational.

To see this we suppose that $\sqrt{x} = \frac{a}{b}$ is rational, but then $x = (\sqrt{x})^2 = \frac{a^2}{b^2}$
is also rational and there is the contradiction we needed.

7 d cannot be equal to b. Why?

8 If you can't wait then have a look at http://en.wikipedia.org/wiki/Golden_ratio 
As an example let’s look at $\sqrt{2} + \sqrt{3}$. We’ll show it is irrational by
using I(iii) to express it as the square root of an irrational number, indeed

\[ (\sqrt{2} + \sqrt{3})^2 = 2 + 3 + 2\sqrt{2}\sqrt{3} = 5 + 2\sqrt{6}, \]

which is irrational by I(i) and I(ii). You might try to generalise this
argument to show that $\sqrt{p} + \sqrt{q}$ is irrational whenever p and q are
distinct prime numbers. However, beware - it is not true that the sum of
any two irrational numbers is itself irrational, e.g. $\sqrt{5} + (4 - \sqrt{5}) = 4$.

I(iv) If a is an irrational number then so is $\frac{1}{a}$.

The argument that proves this is a fairly simple variation on those given 
above and so I feel comfortable leaving it to you to do yourself. 

All irrational numbers have decimal expansions but I’m not going to tell you 
how to find them at this stage. Here are some examples where I’ve given the first 
nine decimal places: 

$\sqrt{2} = 1.414213562\ldots$,
$\sqrt{3} = 1.732050807\ldots$,
The golden section $\frac{1}{2} + \frac{\sqrt{5}}{2} = 1.618033988\ldots$,
$\pi = 3.141592653\ldots$.


Decimal expansions of irrational numbers are always infinite and can never 
be periodic or even eventually periodic. The same is true if we choose expansion 
in any base other than 10. This means that irrational numbers are less tangible 
than rational numbers. If we only know a rational number through its decimal 
expansion then we know that eventually we will find some periodic pattern 
(though we may have to go beyond (say) a googol of decimal places to find 
this). On the other hand, the decimal expansion of an irrational number is 
essentially unknowable, although amateur mathematicians have lots of fun 
in calculating numbers like $\pi$ to greater and greater precision. Indeed as
of August 2010 it was known to 5,000,000,000,000 decimal places (see e.g.
http://en.wikipedia.org/wiki/Chronology_of_computation_of_%CF%80). On the
other hand let’s suppose that you are playing a game with a friend wherein they
give you a number and you have to guess what it is. Let us suppose that they
give you the first 5,000,000,000,000 decimal places of the number and that you
are clever enough to recognise these as coming from $\pi$. Can you then say that
the number really is $\pi$? The answer is no! For all you know you might have been
presented with a rational number that agrees with $\pi$ to the first 5,000,000,000,000
decimal places but is (for example) zero from the 5,000,000,000,001st decimal 
place onwards. 
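
The point of the guessing game can be made on a much smaller scale. The Python sketch below builds the rational number that agrees with $\pi$ for a chosen number of decimal places and is zero from then on. The constant `PI_DIGITS` (the first 20 decimal places of $\pi$, typed in by hand) and the function name `truncate_to_rational` are illustrative choices.

```python
from fractions import Fraction

PI_DIGITS = "3.14159265358979323846"  # pi to its first 20 decimal places


def truncate_to_rational(decimal_string: str, places: int) -> Fraction:
    """The rational number whose expansion is the first `places` decimals, then zeros."""
    whole, fractional = decimal_string.split(".")
    return Fraction(int(whole + fractional[:places]), 10**places)


impostor = truncate_to_rational(PI_DIGITS, 10)
print(impostor)         # a ratio of two integers
print(float(impostor))  # indistinguishable from pi at first glance
```

However many places your friend reveals, a rational impostor of this kind is always available.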
2.3 The Real Numbers


Let’s return to the number line drawn in Figure 1.1. We tried to fill this line up with 
rational numbers representing every distance from the origin to a point on the line. 
This process failed. We have seen that there are infinitely many irrational numbers 
like $\sqrt{2}$ that are not rational but which still measure legitimate distances along
the line. When we combine the rational and irrational numbers together there are 
no more gaps. These numbers are enough to measure every distance along the 
line. A number that is either rational or irrational is called a real number. Note 
that (as usual) the word ‘real’ here is just a name - you should not believe that 
real numbers are any more or less ‘real’ than other types of number that you may 
encounter.⁹ At this stage you should have a reasonable intuition as to what a real
number is, but be aware that we haven’t really given a satisfactory definition of 
them. This is because we haven’t properly defined irrational numbers - all we’ve 
done is provide some examples of them. We could give a working definition of a 
real number as ‘a number that has a decimal expansion’ but that is far too indirect 
as is the geometric definition in terms of distances along a line from an arbitrary 
point. Real numbers he at the heart of much of pure mathematics and are central 
to applications in science and engineering, and yet they are strangely elusive. We 
will return to the problem of how they should be properly defined - but not until 
the very end of the book. 

Not knowing what real numbers are shouldn’t stop us working with them 
and so we’ll take for granted that we can add, subtract, multiply and divide real 
numbers and that the answer is always a real number (as long as we don’t divide 
by zero). How do we add two real numbers? Just add the decimal expansions in 
the usual way. Of course you will never see the whole number, but the first few 
terms are enough, aren’t they? For example $\sqrt{2} + \pi = 4.555806216\ldots$. We will always 
assume in this book that real numbers obey the standard algebra that we expect of 
numbers. So for example we assume that the commutative law of addition holds 
so that for all real numbers $a$ and $b$: 

$$a + b = b + a,$$

and we will assume that multiplication and addition interact through the 
distributive law,¹⁰ i.e. 

$$a(b + c) = ab + ac,$$

for all real numbers $a$, $b$ and $c$. It is fairly easy algebra to prove these for rational 
numbers. For real numbers this must await a proper definition. 

9 For example 'imaginary', and more generally ‘complex numbers’ but we won't consider 
these until Chapter 8, and even then they only make a brief appearance. 

10 See Appendix 4 for a full list of these algebraic properties of numbers. 

Before we (temporarily) leave the realm of the finite, it’s worth listing some 
fascinating facts about irrational numbers which demonstrate their relative 
‘weight’ in comparison to the rational numbers. 

• There are infinitely many rational numbers and infinitely many irrational 
numbers but there are more irrationals than rationals! The irrational 
numbers lay claim to a higher order of infinity. 

• There are an infinite number of irrational numbers between every two 
rational numbers p and q - no matter how close together p and q might 
be on the real number line. 

• There is a rational number between every two irrational numbers. 

• Consider all the real numbers between 0 and 1. This forms a portion of the 
number line that has length 1. Take away all the fractions on that line. Then 
the length of the line is still 1 - so the rational numbers contribute nothing 
to the length of the line. 

We’ll give proofs of all of these facts later on (except the last which is a little 
too sophisticated for this book). At the moment, you might be thinking that the 
real number line is a much more complicated object than you expected - and that 
is a very good way to be thinking. 

I’ll make one more comment on this for now. One way of dividing up the world 
is into the ‘discrete’ and the ‘continuous’. Discrete phenomena are separated from 
each other like cows in a field or the beats of a drum. Continuous phenomena 
appear to flow into each other like the paint on the wall next to my computer. 
All of our direct experience with numbers is with the discrete through counting 
(the natural numbers) or dividing into pieces (the rationals). But the real numbers 
are at a different level as they measure the ‘continuum’ that is represented by the 
number line. The essential difference between the discrete and the continuous 
is captured by the irrationals and this is one of the reasons why we find them 
so strange. They are the first type of number to take us away from our direct 
experience of the world around us. 


2.4 A First Look at Infinity 


The real numbers are all finite numbers. No matter how difficult it may be to 
pin down an irrational number through its decimal expansion, it still represents 
a fixed point on the number line that measures a finite distance from the origin. 
But the line is infinite in extent. What does this really mean? Well this is the same 
phenomenon that we encountered at the end of Section 1.1. No matter how large 
a real number we take, we can always extend the line a little bit further in length 
to get larger real numbers. But what about the full infinite extent of the line? Can 
this be represented by a number? 


We are always taught at school that division by zero is not allowed but let’s 
define a new ‘number’ which we will call ‘infinity’ and represent by the symbol $\infty$ 
as follows: 

$$\infty = \frac{1}{0}. \tag{2.4.4}$$

Now whatever it is, $\infty$ cannot be a real number, but there is some sense to 
(2.4.4) for dividing by a very small number always gives a very large number (e.g. 
$\frac{1}{10^{-26}} = 10^{26}$) - so why not go the whole hog and divide by zero? Well we can (up 
to a point) but as we will see - we have to be very careful. 

If you read Section 1.1 again, your first question might be - what about $\infty + 1$? 
Surely this must be a bigger number than $\infty$? Let’s find out by adding in the usual 
way and then 

$$\infty + 1 = \frac{1}{0} + 1 = \frac{1 + 1 \times 0}{0} = \frac{1 + 0}{0} = \frac{1}{0} = \infty.$$

So $\infty + 1 = \infty$, which is intuitively neat as it tells us that once we get to infinity 
we can’t get any bigger by adding one - the mighty infinite absorbs the puny finite 
into itself. Indeed by the same argument that we just presented you can show that 
$\infty + c = \infty$ for any real number $c$. 

We also have 

$$\infty + \infty = 2\infty = 2 \cdot \frac{1}{0} = \frac{1}{\frac{0}{2}} = \frac{1}{0} = \infty,$$

and by a similar argument $c \times \infty = \infty$ for any real number $c$. 

But before we get carried away let’s return to $\infty + 1 = \infty$. Surely if (as we’ve 
assumed so far) normal arithmetic applies, then we should be able to subtract $\infty$ 
from both sides of this equation to get $1 = \infty - \infty$. But we also have $\infty + 2 = \infty$ 
and that yields $2 = \infty - \infty$, so we have found that $1 = 2$ which is nonsense. Now 
at this point we can stop and conclude that defining $\infty = \frac{1}{0}$ is a bad definition 
as it leads to a contradiction, or we can continue on the basis that the rules of 
ordinary arithmetic are suspended for infinite numbers and that, in particular, 
$\infty - \infty$ makes no sense and so is banned from mathematics. Let’s continue on 
that basis and see what happens. Since e.g. $1 \times \infty = 2 \times \infty = \infty$ we can also 
reject $\frac{\infty}{\infty} = \frac{0}{0}$ as being meaningless expressions. 

Now if you’ve accepted the story so far you might argue that there should 
be two infinite numbers - one that measures the positive infinite distance from 
zero to the never-ending right of the number line and another that measures the 
distance to the left. If we identify $\infty$ with the infinite positive quantity, then the 
negative one should be $-\infty = \frac{-1}{0}$. But notice that 

$$\frac{-1}{0} = \frac{-1 \times 1}{-1 \times 0} = \frac{1}{0},$$

so we’ve shown that $-\infty = \infty$!¹¹ Now either this tells us that infinite numbers 
are too crazy for their own good, or perhaps that the infinite line should really 
be extended to some sort of infinite circle where positive and negative infinities 
can meet and merge. In any case it’s time to stop this limited and unsatisfactory 
exploration of the infinite. From now on, any attempted use of ‘$\infty = \frac{1}{0}$’ will be 
banned from our mathematical world. In Chapter 10 we’ll meet some of the ideas 
of the nineteenth century German mathematician Georg Cantor who gave a more 
sophisticated approach to infinite numbers. For now, it’s time to get our feet back 
on the ground! 
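
It can be instructive to compare this exploration with how a programming language treats infinity. The sketch below uses Python's floating-point infinity, `math.inf`. That is an assumption made for illustration: it is an IEEE 754 convention and not the ‘number’ $\frac{1}{0}$ of this section. Some of its behaviour matches the story above and some does not.

```python
import math

inf = math.inf  # IEEE 754 floating-point infinity, not the text's 1/0

# The mighty infinite absorbs the puny finite.
print(inf + 1 == inf)         # True
print(inf + inf == inf)       # True

# The banned expressions do not produce a number at all ("not a number").
print(math.isnan(inf - inf))  # True
print(math.isnan(inf / inf))  # True

# Unlike the conclusion reached in the text, the two infinities stay distinct here.
print(-inf == inf)            # False

# And the definition itself is refused outright.
try:
    1 / 0
except ZeroDivisionError as error:
    print("1 / 0 is rejected:", error)
```

So floating-point arithmetic makes the same choice as the text over $\infty - \infty$, but keeps $\infty$ and $-\infty$ apart instead of merging them.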


2.5 Exercises For Chapter 2 


1. Exhibit the number 0.8125 in binary notation. 

2. Use long division to write yj as a recurring decimal. 

3. (a) Consider the recurring decimal $x = 0.\dot{3}\dot{9}$. By writing $99x = 100x - x$, show 
that $x = \frac{13}{33}$ as a fraction in its lowest terms. 

(b) Use a similar technique to write 0.51 07 as a fraction in its lowest terms. 

4. Write down five rational numbers between 5gy and 5g. 

5. Which (if any) of the following numbers are irrational (a) $\sqrt{1296}$, (b) $\sqrt{1297}$? 
(Try to answer this question without using a calculator.) 

6. Is it true that if $x$ and $y$ are irrational numbers then their product $xy$ is always 
irrational? Give a careful proof or a counter-example to the claim. 

7. This question develops an alternative proof (which is very often presented in 
textbooks) that $\sqrt{2}$ is irrational. Assume $\sqrt{2}$ is rational and so can be written $\frac{p}{q}$ 
as a fraction in its lowest terms. Square both sides and so deduce that $p^2$, and 
hence $p$, is even. Now write $p = 2m$ for some natural number $m$. Show that $q$ 
is even and find the contradiction in our assumption. 

8. Use a similar argument to that of Question 7 to show that $\sqrt[3]{2}$ is irrational. 

9. Can you find two irrational numbers $a$ and $b$ such that $a^b$ is rational? [Hint: 
Thinking about the number $\sqrt{2}^{\sqrt{2}}$ is a good place to start.] 

10. In 1933, a Cambridge undergraduate, David Champernowne, studied the 
real number $0.123456789101112131415161718192021\ldots$ formed by 


the natural numbers written in sequence. Is this number rational or irrational? 
Give a reason for your answer. Write down one rational and one irrational 
number strictly between Champernowne’s number and $0.12345678916$. 

11. Assume that $\infty$ can be treated like an ordinary number. What sense can you 
give to $\sqrt{\infty}$? 


11 The identification of $\infty$ with $-\infty$ has a sound geometric intuition behind it - see the 
section on inversion in the circle in ‘To Infinity and Beyond’ by Eli Maor. This book is briefly 
discussed in the Further Reading section. 


The Joy of Inequality 


"Are you content now?" said the Caterpillar. "Well I should like to be a little 
larger, Sir, if you wouldn't mind," said Alice. 

Alice in Wonderland, Lewis Carroll 


3.1 Greater or Less? 


When we first learn mathematics, equations are a dominant theme. The 
essence of an equation lies in the fact that there is an ‘equals sign’ $=$ which 
separates two seemingly distinct expressions, and our job is to solve the equation 
which involves finding conditions under which the two expressions really are the 
same. For example consider the quadratic equation 

$$x^2 + 6 = 5x. \tag{3.1.1}$$

With a little bit of algebra we can show that this is equivalent to 

$$x^2 - 5x + 6 = 0,$$

which has two solutions $x = 2$ and $x = 3$. So the expression on the left-hand 
side of (3.1.1) is equal to that on the right-hand side when $x = 2$ or when $x = 3$. 
Otherwise they will be unequal. 
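
If you would like to see the two solutions appear for yourself, here is a small Python sketch. The function name and the search range are arbitrary illustrative choices; the code simply tests each integer in the range against the quadratic $x^2 - 5x + 6 = 0$.

```python
def quadratic(x):
    """Left-hand side of x^2 - 5x + 6 = 0."""
    return x * x - 5 * x + 6

# Search a small range of integers for solutions (the range is an arbitrary choice).
solutions = [x for x in range(-10, 11) if quadratic(x) == 0]
print(solutions)  # [2, 3]
```

For every other integer in the range the two sides of the equation are unequal.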

In this chapter we will shift the emphasis from equality to inequality. Of course 
we are much more used to seeing inequality than equality in the world around 
us, so it should be no surprise that mathematicians have developed extensive 
techniques for investigating this. We will see that these techniques are absolutely 
vital for exploring the world of limits. 

Let’s try to compare two distinct real numbers $a$ and $b$. The most basic 
inequality of all is 

$$a \neq b, \tag{3.1.2}$$

where the line through $=$ nicely symbolises the act of cancelling the possibility that 
$a$ and $b$ might be equal. So (3.1.2) would be satisfied if e.g. $a = 2.713$ and $b = 2.714$. 
This is all very nice but it won’t take us very far. The serious players in the study 
of the unequal are four rather more subtle relations which we denote $<$, $\le$, $>$ and 
$\ge$. Let’s meet these one at a time. We write $a < b$ whenever $a$ is strictly smaller 
than $b$. The word ‘strictly’ used here emphasises the fact that equality $a = b$ is 
excluded. Equivalently $a < b$ whenever $b - a$ is a positive real number. So for 
example $3.12 < 3.1213$. We say that $a \le b$ whenever $a$ is either smaller than or 
equal to $b$ and the line under the $<$ sign indicates that $a = b$ is no longer excluded. 
So we can certainly write $3.12 \le 3.1213$ but it is also correct to say $3.12 \le 3.12$. 
Having the two symbols $<$ and $\le$ may seem like hair-splitting but we will see that 
the distinction that these allow us to make can be quite crucial. Now if 
$a$ is less than $b$ then $b$ is greater than $a$ and the symbol $>$ nicely encapsulates this 
reversal of roles; indeed we can define $a > b$ to mean the same thing as $b < a$ and 
$a \ge b$ to be equivalent to $b \le a$. You might like to try proving the (obvious?) fact 
that $a \le b$ and $b \le a$ if and only if $a = b$. It’s also worth pointing out that $a < b$ 
implies $a \le b$, but the converse statement is false. 

The symbols $<$, $\le$, $>$ and $\ge$ are sometimes called order relations as they allow 
us to capture the natural order structure of the real number line whereby numbers 
get greater as you move to the right. Be aware though that when we consider 
negative numbers we have for example $-2 < -1$ and sometimes people find this 
to be counter-intuitive as $-2$ has a larger magnitude than $-1$. But $-2$ is further to 
the left than $-1$ on the real number line and so is smaller, as shown in Figure 3.1. 
If in doubt, it’s always best to go back to the definition. Recall that we’ve agreed 
that $a < b$ means that $b - a$ is positive. Now put $a = -2$ and $b = -1$ then 
$b - a = -1 - (-2) = -1 + 2 = 1$ which is certainly a positive number. 

We need to be able to manipulate inequalities and understand how they interact 
with addition, subtraction, multiplication and division. We already have one 
important rule which is effectively the definition of $<$, but it’s worth writing this 
down again and giving it a name (L1) so that we can refer back to it later (the L here 
stands for ‘less than’ as we are collecting together a list of rules for manipulating 
the ‘less than’ symbol $<$): 

(L1) $a < b$ if and only if $b - a > 0$. 


Figure 3.1. $-2 < -1 < 0 < 1 < 2$. 

We also have a version of this for $\le$: 

(L1)′ $a \le b$ if and only if $b - a \ge 0$. 

I’ll now list and prove some other very useful rules for manipulating 
inequalities. All of these will be expressed in terms of $<$ but they all have $\le$ versions 
which are obtained by carefully substituting every $<$ with $\le$ in a systematic 
manner. 
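
Rules (L1) and (L1)′ translate almost word for word into code. The following is a minimal Python sketch, assuming the standard `fractions` module so that the decimals from the text are handled exactly; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def less_than(a, b):
    """(L1): a < b exactly when b - a is positive."""
    return b - a > 0

def less_or_equal(a, b):
    """(L1)': a <= b exactly when b - a is nonnegative."""
    return b - a >= 0

print(less_than(-2, -1))                                  # True
print(less_than(Fraction("3.12"), Fraction("3.1213")))    # True
print(less_than(Fraction("3.12"), Fraction("3.12")))      # False
print(less_or_equal(Fraction("3.12"), Fraction("3.12")))  # True
```

The last two lines show the hair-splitting at work: the same pair of numbers fails the strict test but passes the non-strict one.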


3.1.1 Algebra of Inequalities 

(L2) Adding a constant. If a < b then 

a + c < b + c for any real number c. 

To prove this we’ll use a nice little technique which I call the ‘mathematician’s 
favourite trick’ (or MFT for short). So I’ll tell you what MFT is first before we go 
further. Suppose that we have an expression (which might be quite complicated) 
that is equal to some real number $x$. First write 

$$x = x + 0,$$

then we can write $0 = y - y = -y + y$ where $y$ is any real number, so that 

$$x = x + (-y + y) = (x - y) + y.$$

Now the point of the MFT is that in some situations we can find a $y$ so that 
both $x - y$ and $y$ are easier to deal with than $x$ and this is often the key to making 
further progress. Don’t worry if this seems rather strange to you - we’ll be using 
the MFT quite a lot in this book so it should be quite a familiar tool by the time 
we reach the end. 

Now we’ll prove (L2): 

If $a < b$ then $b - a > 0$ so by MFT 

$$b + c - a - c > 0,$$

i.e. $(b + c) - (a + c) > 0$, 
i.e. $a + c < b + c$. 

You can extend (L2) to a stronger result by more or less the same reasoning, 
to show that if a, b , c and d are four real numbers such that a < b and c < d then 
a + c < b + d. You may like to try to prove that yourself. I hope I won’t confuse 
anyone if I also refer to that result as (L2) later in the book. 

(L3) Multiplying by a positive number. If $a < b$ and $c > 0$ then 

$$ac < bc.$$

I won’t prove (L3) as it’s quite easy. 

(L4) Multiplying by a negative number. If $a < b$ and $c < 0$ then 

$$ac > bc.$$

Before we prove (L4) let’s discuss it. First of all it’s a little less obvious than 
(L3) as it tells us that multiplying by a negative number changes the direction of 
an inequality. If you think this is strange, it’s always best to do a few numerical 
experiments for yourself before going further. So let’s take $a = -3$ and $b = -2$. 
We know $-3 < -2$ so $a < b$ is satisfied in this case. Now choose $c = -2$. Then 
$ac = -2 \times -3 = 6$ and $bc = -2 \times -2 = 4$. Of course $6 > 4$ so $ac > bc$ as was 
promised. 

To prove (L4) we use (L1) and notice that $b - a > 0$ so it is a positive 
number. We’ve required $c$ to be a negative number and we know that a positive 
number multiplied by a negative number is always negative (see Section 1.3). So 
$c(b - a) < 0$ (as it is negative). Now multiply out the bracket to get $bc - ac < 0$ 
and that tells us that $ac > bc$ which is what we wanted. 
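
The numerical experiment suggested for (L4) is easy to automate. Here is a Python sketch under stated assumptions: the helper name is hypothetical, and the grid of rational numbers is an arbitrary sample. Passing such a check is evidence, of course, and not a proof.

```python
from fractions import Fraction
from itertools import product

def reverses_under_negative(a, b, c):
    """Check the conclusion of (L4) for one triple with a < b and c < 0."""
    assert a < b and c < 0
    return a * c > b * c

# The example from the text: a = -3, b = -2, c = -2.
print(reverses_under_negative(-3, -2, -2))  # True, since 6 > 4

# A small grid of rationals (an arbitrary illustrative sample, not a proof).
values = [Fraction(n, 4) for n in range(-8, 9)]
checks = [
    reverses_under_negative(a, b, c)
    for a, b, c in product(values, repeat=3)
    if a < b and c < 0
]
print(all(checks))  # True
```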

(L4) takes a very special form when $c = -1$. In this case it tells us that if $a < b$ 
then $-b < -a$ and we’ll use this quite often. You can see this as a consequence 
of the way that multiplication by $-1$ acts as a reflection through a mirror as we 
discussed at the end of Chapter 1 and as shown in Figure 3.2. 

We’ve just seen that multiplication by negative numbers has the effect of 
reversing inequalities. Another way of doing this is by ‘inversion’. 


Figure 3.2. If $a < b$ then $-b < -a$. 

(L5) Inversion. If $a$ and $b$ have the same sign (i.e. both are positive or both 
are negative) and $a < b$ then 

$$\frac{1}{b} < \frac{1}{a}.$$

Again before you prove this - try an experiment if it seems strange, e.g. check 
what happens when $a = 2$ and $b = 4$. 

To prove (L5) again use (L1) to transform $a < b$ to $b - a > 0$. Now since $a$ 
and $b$ both have the same sign we have $ab > 0$. Now take $c = \frac{1}{ab}$ in (L3) and we 
have that $\frac{b - a}{ab} > 0$. The result we are seeking follows when we do a little algebra 
and observe that $\frac{1}{a} - \frac{1}{b} = \frac{b - a}{ab} > 0$. 

Be aware that (L5) fails if $a$ and $b$ have opposite signs e.g. $-2 < 3$ but $\frac{1}{3} > -\frac{1}{2}$. 
You should be able to conjecture and then prove the replacement for (L5) in this 
case. 
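
The experiment for (L5), and its failure for opposite signs, can be sketched in Python as follows. The function name is a hypothetical choice, and exact fractions are used so that no rounding interferes.

```python
from fractions import Fraction

def inversion_reverses(a, b):
    """Return True when 1/b < 1/a, the conclusion of (L5)."""
    return 1 / Fraction(b) < 1 / Fraction(a)

print(inversion_reverses(2, 4))    # True:  1/4 < 1/2
print(inversion_reverses(-4, -2))  # True:  -1/2 < -1/4
print(inversion_reverses(-2, 3))   # False: opposite signs, 1/3 > -1/2
```

The third line is the counter-example from the text, and it hints at what the replacement rule for opposite signs ought to say.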


(L6) Squaring. If $a$ and $b$ are both nonnegative then 
$a < b$ if and only if¹ $a^2 < b^2$. 

To prove this we use the well-known algebraic identity 

$$b^2 - a^2 = (b - a)(b + a),$$

from which we see that the left-hand side is positive (or negative, respectively) if 
and only if the right-hand side is and since $b + a > 0$ the sign of $b^2 - a^2$ is the 
same as the sign of $b - a$. You should be able to see the rest from there. 

We have already pointed out that (L2) to (L6) have straightforward extensions 
to the case where $<$ is replaced by $\le$. They can also all be adapted to the cases 
where we have $>$ and $\ge$ (indeed these follow immediately by using the fact that 
$>$ is defined in terms of $<$ and $\ge$ in terms of $\le$) so that e.g. (L4) becomes 

If $a > b$ and $c < 0$ then $ac < bc$. 

Here’s a beautiful application of inequalities. We’ll show that between any two 
rational numbers (no matter how close they are) there are an infinite number of 
irrational numbers - so e.g. there are infinitely many irrational numbers between 
0.49999999 and 0.5. 

Theorem 3.1.1. Given any two rational numbers a and b with a < b we can find 
infinitely many irrational numbers q such that 

a < q < b. 

Proof. Let $p$ be a prime number. Since $p > 1$ then $\sqrt{p} > 1$ by (L6) and so 
$\frac{1}{\sqrt{p}} < 1$ by (L5). Now define $q = a + \frac{1}{\sqrt{p}}(b - a)$; $q$ is irrational by I(i) and I(ii). 

1 If $p$ and $q$ are propositions then ‘$p$ if and only if $q$’ means that $p$ implies $q$ and $q$ implies $p$. 

Figure 3.3. The interval between $a$ and $b$ is shaded.⁴ 


Since there are infinitely many prime numbers it follows that there are infinitely 
many numbers of this form. We will prove that $a < q < b$. Now $q > a$ since 
$q - a = \frac{1}{\sqrt{p}}(b - a) > 0$ and $q < b$ since 

$$b - q = b - a - \frac{1}{\sqrt{p}}(b - a) = (b - a)\left(1 - \frac{1}{\sqrt{p}}\right) > 0$$

and the result is established. □ 
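
The construction in the proof of Theorem 3.1.1 can be tried numerically. The Python sketch below makes two assumptions that should be stated loudly: the function name is hypothetical, and floating-point numbers are themselves rational approximations, so the values printed only approximate the irrational numbers $q$. What the code does show is that each candidate lands strictly between $a$ and $b$, here using the pair 0.49999999 and 0.5 mentioned before the theorem.

```python
import math

def irrational_candidates(a, b, primes):
    """q = a + (b - a)/sqrt(p) for each prime p, as in the proof of Theorem 3.1.1."""
    return [a + (b - a) / math.sqrt(p) for p in primes]

a, b = 0.49999999, 0.5
qs = irrational_candidates(a, b, [2, 3, 5, 7, 11, 13])

for q in qs:
    # Each line shows a floating-point approximation to q and whether a < q < b.
    print(f"{q:.12f}", a < q < b)
```

Larger primes push $q$ closer to $a$, which is why a different number appears for each prime.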


3.2 Intervals 


Intervals are very important parts of the real number line. They work like this. Fix 
$a$ and $b$ so that $a < b$. An interval (see Figure 3.3) is the collection² of all numbers 
that lie between $a$ and $b$. That’s a bit vague as we haven’t said anything about 
whether the end-points $a$ and $b$ should be included. If we worry about these we 
get four different intervals that extend from $a$ to $b$. 

$(a, b)$ is called an open interval. It comprises all the points lying strictly between 
$a$ and $b$ but the end-points $a$ and $b$ themselves are NOT included.³ 

$[a, b)$ and $(a, b]$ are called half-open intervals. The first of these includes all of 
the points in $(a, b)$ together with $a$ but not $b$ while the second has $b$ in it but $a$ is 
excluded. 

$[a, b]$ is called a closed interval. It contains all of the points in $(a, b)$ as well as 
both end-points $a$ and $b$. 

2 Technically speaking we should say 'set' (see Appendix 2). 

3 Some mathematicians prefer to use the notation $]a, b[$ for open intervals and $[a, b[$ 
and $]a, b]$ for the half-open intervals $[a, b)$ and $(a, b]$, respectively. 

4 I deliberately haven't said if the interval is open or closed as you can’t easily show this on 
a diagram. 

Note that, for example, $1$ is in the interval $(-2, 1]$ but it is not in $[-2, 1)$. 
However $0$ is in both of these intervals. We can also express intervals quite nicely 
using inequalities (which is one reason why I included them in this chapter). So 
a number $x$ is in the interval $(a, b)$ if and only if $x > a$ and $x < b$. Whenever we 
have two inequalities like this it is notationally convenient to combine them and 
write $a < x < b$ which translates precisely as ‘$x$ is larger than $a$ but smaller than 
$b$’ (or ‘$x$ is between $a$ and $b$’) and this is precisely what it means to be inside the 
interval $(a, b)$. We won’t do anything more with intervals for now - but we’ll meet 
them again later on. 
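
The four kinds of interval differ only in which inequalities are strict, and that is easy to express in code. The following Python sketch uses a hypothetical function with two flags (names chosen for this illustration) to say whether each end-point is included.

```python
def in_interval(x, a, b, include_left=False, include_right=False):
    """Membership test for an interval from a to b, assuming a < b."""
    left_ok = x >= a if include_left else x > a
    right_ok = x <= b if include_right else x < b
    return left_ok and right_ok

print(in_interval(1, -2, 1, include_right=True))  # True:  1 is in (-2, 1]
print(in_interval(1, -2, 1, include_left=True))   # False: 1 is not in [-2, 1)
print(in_interval(0, -2, 1, include_right=True))  # True
print(in_interval(0, -2, 1, include_left=True))   # True
```

With both flags left at their default the function tests the open interval, i.e. the combined inequality $a < x < b$.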


3.3 The Modulus of a Number 


When we’re dealing with numbers there are occasions when the sign of the 
number, i.e. whether it is positive or negative, is absolutely crucial, and other 
times when all that matters is the magnitude or size of the number. To deal with 
the latter case we introduce a neat notation $|x|$ and we call this the modulus of the 
real number $x$. So e.g. $|7| = 7$ but $|-23| = -(-23) = 23$. A slick way of defining 
the modulus in terms of $-$, $\ge$ and $<$ is 

$$|x| = \begin{cases} x & \text{if } x \ge 0 \\ -x & \text{if } x < 0, \end{cases}$$

so that in particular $|0| = 0$. Notice that the definition takes advantage of the fact 
that if $x$ is negative, then $x = -|x|$. 

It’s worth pointing out that we get two very obvious (but sometimes useful) 
inequalities from this definition: 

$$x \le |x| \quad \text{and} \quad -x \le |x|. \tag{3.3.3}$$

Figure 3.4 shows a graph of $|x|$ plotted against $x$. 

Figure 3.5. The modulus as a length. The points $-a$, $0$ and $a$ are marked on the number line; the distance from $0$ to each of $-a$ and $a$ is $|a|$. 
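
The definition of the modulus by cases can be written out directly. This is a minimal Python sketch with a hypothetical function name; Python's built-in `abs` does the same job, and the sample values are arbitrary.

```python
def modulus(x):
    """|x| defined by cases: x if x >= 0, and -x if x < 0."""
    return x if x >= 0 else -x

print(modulus(7))    # 7
print(modulus(-23))  # 23
print(modulus(0))    # 0

# The two inequalities (3.3.3) on a few sample values (illustrative only).
samples = [-23, -1.5, 0, 2, 7]
print(all(x <= modulus(x) and -x <= modulus(x) for x in samples))  # True
```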


The way in which the modulus interacts with the + operation gives rise to one 
of the most important inequalities in the whole of mathematics. It is called the 
triangle inequality and we’ll present it as a theorem. 

Theorem 3.3.1. For all real numbers $a$ and $b$, 

$$|a + b| \le |a| + |b|. \tag{3.3.4}$$


Now before we prove this theorem, let’s make a few remarks. The first question 
you might ask is - why is it $\le$ here? Why not $=$? After all if you put $a = 2$ and 
$b = 1$ the left-hand side and right-hand side of (3.3.4) are both 3. But on the other 
hand if you put $a = 2$ and $b = -1$ then the left-hand side is 1 and the right-hand 
side is 3 so it’s a clear $<$ in this case. 

Secondly you might ask - why is (3.3.4) called the ‘triangle inequality’? Well 
first of all, it’s useful to think about (3.3.4) in terms of the real number line. The 
modulus of the real number x has a nice geometric meaning here - it is simply the 
distance of the number x from the ‘origin’ 0, or equivalently the length of the line 
segment joining 0 and x (see Figure 3.5). So (3.3.4) is telling us that the length of 
a + b can never exceed the sum of the lengths of a and b. 

Does this remind you of anything to do with triangles? If you know the result 
that in any triangle the length of the longest side can never be greater than the 
sum of the two other sides, then you’re spot on. In fact that result can also be 
expressed in the form (3.3.4) when $|x|$ is re-interpreted as the length of a vector 
in two-dimensional space (see Figure 3.6).⁵ 

By the way, the interaction between the modulus and the $\times$ operation is much 
simpler and I’ll leave it to you to show that for all real numbers $a$ and $b$, 

$$|ab| = |a||b|. \tag{3.3.5}$$

Note that as $\frac{a}{b} = a \times \frac{1}{b}$, we get $\left|\frac{a}{b}\right| = \frac{|a|}{|b|}$ as a free gift from (3.3.5). 

Now we’re ready to give the proof of Theorem 3.3.1. Because this result 
is so important we’ll give not one but two proofs. Is this overdoing it? Well 
mathematicians quite often have more than one proof of important results. 
A different proof can sometimes give new and important insights or might display 
greater elegance or beauty.⁶ 

5 The inequality (3.3.4) also extends to higher and even infinite-dimensional spaces when 
$|x|$ is given a suitable meaning. 

Proof 1 of Theorem 3.3.1. This proof works by exhausting all four combinations 
of signs that $a$ and $b$ can have. 

(i) $a \ge 0$, $b \ge 0$. In this case 

$$|a + b| = a + b = |a| + |b|.$$

(ii) $a < 0$, $b < 0$. This works in more or less the same way as (i). 

(iii) $a < 0$, $b \ge 0$. We then have $a = -|a|$. Now either $b \ge |a|$ or $b < |a|$. 

If $b \ge |a|$, $|a + b| = b - |a| \le b + |a| = |a| + |b|$. 

If $b < |a|$, $|a + b| = |a| - b \le |a| + |b|$. 

(iv) $a \ge 0$, $b < 0$. This is proved the same way as in (iii). 

The inequality (3.3.4) holds in all four cases. There are no other possibilities to 
consider and that completes the proof. □ 

6 The great mathematician Carl Friedrich Gauss (1777-1855) gave eight distinct proofs 
in his lifetime of a result about numbers called the 'quadratic reciprocity theorem' and he 
published the first of these at the age of nineteen, see e.g. 
http://mathworld.wolfram.com/QuadraticReciprocityTheorem.html 

Proof 2 of Theorem 3.3.1. Now since the square of a number is always 
nonnegative it follows that 

$$|a + b|^2 = (a + b)^2 = a^2 + 2ab + b^2 = |a|^2 + 2ab + |b|^2.$$

Now using (3.3.3) and (3.3.5) we get 

$$ab \le |ab| = |a||b|,$$

and so we see that 

$$|a + b|^2 \le |a|^2 + 2|a||b| + |b|^2 = (|a| + |b|)^2.$$

Now use (L6) to get $|a + b| \le |a| + |b|$ and our proof is complete. □ 

Which proof do you prefer? Proof 2 is the one that usually appears in 
mathematical textbooks and it has the distinct advantage over Proof 1 of 
generalising directly to the higher-dimensional case, with the only necessary 
change being that |x| is interpreted as the length of a vector. 
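
Before moving on, the two remarks made ahead of the proofs (equality for $a = 2$, $b = 1$ and strict inequality for $a = 2$, $b = -1$) can be checked in a few lines of Python. The helper name and the integer grid are illustrative assumptions; a finite check like this supports the theorem but does not replace either proof.

```python
from itertools import product

def triangle_gap(a, b):
    """|a| + |b| - |a + b|; the triangle inequality says this is never negative."""
    return abs(a) + abs(b) - abs(a + b)

print(triangle_gap(2, 1))   # 0: equality, both sides are 3
print(triangle_gap(2, -1))  # 2: strict inequality, 1 < 3

# A small integer grid (illustrative sample, not a proof).
grid = range(-5, 6)
print(all(triangle_gap(a, b) >= 0 for a, b in product(grid, repeat=2)))  # True
```

The gap is zero exactly when $a$ and $b$ do not have opposite signs, which matches cases (i) and (ii) of Proof 1.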

We’ve now dealt with the interaction of the modulus with $\times$, $\div$ and $+$. The 
next question is what happens with $-$? I’m constantly surprised how many 
undergraduate students believe that the triangle inequality can be ‘stretched’ 
to give ‘$|a - b| \le |a| - |b|$’. Of course this is WRONG! Just try $a = 2$ and 
$b = -3$ then the left-hand side is 5 while the right-hand side is $-1$. This at 
least suggests that maybe the inequality should be reversed so that we have 
$|a| - |b| \le |a - b|$. That is correct, however it turns out that the order of $|a|$ 
and $|b|$ is unimportant here and that we also have $|b| - |a| \le |a - b|$. Now since 
$||a| - |b||$ is either equal to $|a| - |b|$ or $|b| - |a|$ we might just as well prove 
the following - which is expressed as a corollary as it follows so easily from 
Theorem 3.3.1. 

Corollary 3.3.1. For all real numbers $a$ and $b$, 

$$||a| - |b|| \le |a - b|. \tag{3.3.6}$$

Proof. We’ll use the mathematician’s favourite trick (MFT) followed by the 
triangle inequality to get 

$$|a| = |a - b + b| \le |a - b| + |b|.$$

Now use (L2) (with $c = -|b|$) to get 

$$|a| - |b| \le |a - b|.$$

Now repeat the argument that we’ve just used but with the roles of $a$ and $b$ 
interchanged. You then get 

$$|b| - |a| \le |b - a| = |a - b|,$$

and that’s it. □ 
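
Both the tempting wrong inequality and the corollary can be tested on the example $a = 2$, $b = -3$ from the text. The Python sketch below is illustrative only; the function name is hypothetical and the grid of integers is an arbitrary sample.

```python
def reverse_triangle_holds(a, b):
    """Check ||a| - |b|| <= |a - b| for one pair of numbers."""
    return abs(abs(a) - abs(b)) <= abs(a - b)

# The 'stretched' version fails for the text's example.
a, b = 2, -3
print(abs(a - b), abs(a) - abs(b))    # 5 -1
print(abs(a - b) <= abs(a) - abs(b))  # False

# The corollary holds for it, and on a small integer grid (illustrative only).
print(reverse_triangle_holds(a, b))   # True
print(all(reverse_triangle_holds(x, y)
          for x in range(-5, 6) for y in range(-5, 6)))  # True
```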


3.4 Maxima and Minima 


Here’s another very short but useful little topic that finds its home in this chapter. 
Suppose that we have a list of $N$ real numbers that I’ll call $a_1, a_2, \ldots, a_N$. These 
are not in any particular order and the subscript $1, 2, \ldots, N$ is just used as 
a convenient device to distinguish the numbers from each other. Unless they 
are all equal to the same number, one of the numbers on the list will be the 
largest and one will be the smallest. The largest one is called the maximum 
and the smallest one is the minimum. We’ll shorten these to max and min 
respectively so $\max(a_1, a_2, \ldots, a_N)$ picks out the largest number on the list and 
$\min(a_1, a_2, \ldots, a_N)$ identifies the smallest. To see how these are used in practice 
suppose that we have the following list: $-3, 1, 9, 4, -7$. Then 

$$\max(-3, 1, 9, 4, -7) = 9 \quad \text{and} \quad \min(-3, 1, 9, 4, -7) = -7.$$

In general we have the following rather obvious inequalities: for all $i$ running 
from $1$ to $N$: 

$$\min(a_1, a_2, \ldots, a_N) \le a_i \le \max(a_1, a_2, \ldots, a_N).$$

It’s also worth pointing out a nice link between the modulus and the maximum: 

$$|x| = \max(x, -x),$$

for any real number $x$. 
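
Most programming languages provide max and min directly. As a small sketch in Python, using its built-in `max`, `min` and `abs` on the list from the text (the variable names are illustrative):

```python
numbers = [-3, 1, 9, 4, -7]

largest = max(numbers)
smallest = min(numbers)
print(largest, smallest)  # 9 -7

# Every entry sits between the minimum and the maximum.
print(all(smallest <= a <= largest for a in numbers))  # True

# The link with the modulus: |x| = max(x, -x).
print(all(abs(x) == max(x, -x) for x in numbers))  # True
```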


3.5 The Theorem of the Means 


In this section we’ll give an example of a very interesting inequality which holds 
between two different types of average. Before we start to develop this it may be 
worth making a few remarks about $n$th roots. These are defined very similarly to 
square roots (see Section 2.2) but the number ‘two’ is replaced by an arbitrary 
natural number $n \ge 2$. So if $x$ is a positive real number we define $\sqrt[n]{x}$ to be the 
unique positive real number $y$ for which $y^n = x$. Indeed it can be shown that 
such a $y$ always exists and if $n$ is odd then $y$ is the unique real number for which 
$y^n = x$. If $n$ is even then $-y$ is also a solution to this equation. If $x$ is a natural 
number, then just as was the case with square roots, $\sqrt[n]{x}$ is always an irrational 
number unless $x = a^n$ for some other natural number $a$, in which case $\sqrt[n]{x} = a$, 
e.g. $\sqrt[4]{81} = 3$ since $3^4 = 81$. 

Now let’s investigate averages. If you’ve attended just a very basic course in 
statistics you will have come upon the arithmetic mean which is often just called 
the mean or the population mean. To define it precisely let’s suppose that we are 
given $n$ real numbers $a_1, a_2, \ldots, a_n$. Their arithmetic mean which I’ll here denote 
by $A_n$ is defined by 

$$A_n = \frac{a_1 + a_2 + \cdots + a_n}{n}. \tag{3.5.7}$$

So for example if these were the heights of children in a school class, the 
arithmetic mean would be a measure of their average height and we might like to 
compare this to national data on heights for this age group to see if this group of 
kids was ‘normal’. 

The arithmetic mean is based on addition. By contrast, the geometric mean $G_n$ 
is constructed using multiplication and (for nonnegative numbers) we define this by 

$$G_n = \sqrt[n]{a_1 a_2 \cdots a_n}. \tag{3.5.8}$$

Why do we call it geometric? Well if $n = 2$, $G_2 = \sqrt{a_1 a_2}$ is the length of the 
side of a square that has the same area as the rectangle with sides $a_1$ and $a_2$, if 
$n = 3$, $G_3 = \sqrt[3]{a_1 a_2 a_3}$ is the side of a cube that has the same volume as a box with 
sides $a_1$, $a_2$ and $a_3$, and so it continues to higher dimensions. It’s interesting to 
compare $G_n$ and $A_n$. First of all observe that if $a_1 = a_2 = \cdots = a_n = a$, then 

$$A_n = \frac{na}{n} = a = \sqrt[n]{a^n} = G_n, \tag{3.5.9}$$

so the two means are always equal in this case. Now let’s look at the case where 
$n = 5$ with $a_1 = 3$, $a_2 = 5$, $a_3 = 1$, $a_4 = 9$ and $a_5 = 11$. Here you can check that 
$A_5 = 5.8$ while $G_5 = 4.309$ (to 3 decimal places of accuracy). If you check with 
different sets of distinct positive numbers, you should find that the geometric mean is 
always smaller than the arithmetic mean. So that’s our conjecture and we should 
aim to prove it. 
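
The check of $A_5$ and $G_5$ suggested above takes only a few lines of Python. This sketch assumes Python 3.8 or later (for `math.prod`); the function names are illustrative, and the geometric mean is computed in floating point, so it is an approximation.

```python
import math

def arithmetic_mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)

def geometric_mean(values):
    """n-th root of the product; assumes every value is nonnegative."""
    return math.prod(values) ** (1 / len(values))

data = [3, 5, 1, 9, 11]
print(round(arithmetic_mean(data), 3))  # 5.8
print(round(geometric_mean(data), 3))   # 4.309
```

Trying other lists of distinct positive numbers is a quick way to gather evidence for the conjecture before reading the proof.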

Before we do this we will add one more weapon to our arsenal. We need to 
extend (L6) for squares to nth powers. This works fine and we have (the e here 
stands for ‘extended’) : 


(L6e) Suppose that a and b are nonnegative real numbers. Then $a^n < b^n$ if and
only if $a < b$.$^7$

The next result is the celebrated Theorem of the Means.


Theorem 3.5.1. The geometric mean of n nonnegative real numbers can never exceed
the arithmetic mean, i.e. given n nonnegative real numbers $a_1, a_2, \ldots, a_n$

$$\sqrt[n]{a_1 a_2 \cdots a_n} \le \frac{a_1 + a_2 + \cdots + a_n}{n}.$$

7 To prove this you need the algebraic identity

$$b^n - a^n = (b - a)(b^{n-1} + ab^{n-2} + a^2 b^{n-3} + \cdots + a^{n-2}b + a^{n-1}).$$

Proof. To make our job simpler, we’ll assume that all the $a_i$s are unequal and
that none of them are zero. In fact if any $a_i = 0$ then $0 = G_n < A_n$. I’ll leave it to
you to figure out how the proof I’ll give below should be tweaked if two or more
$a_i$s are equal.

(*) Let $a_T = \max(a_1, a_2, \ldots, a_n)$ and $a_B = \min(a_1, a_2, \ldots, a_n)$. Now form a
new set of numbers $a'_1, a'_2, \ldots, a'_n$. The only difference between these numbers
and the original ones is that $a_T$ has been removed and replaced by $G_n$ and also $a_B$
has been taken away and replaced by $\frac{a_T a_B}{G_n}$. Let $G'_n$ and $A'_n$ denote the geometric
and arithmetic means (respectively) of this new set of numbers. Now a little bit of
algebra should easily convince you that $G'_n = G_n$. It takes a bit more work to see
how $A'_n$ relates to $A_n$ so let’s do that now.

Since $a_B^n < a_1 a_2 \cdots a_n < a_T^n$ it follows from (L6e) that $a_B < G_n < a_T$. So
$a_T - G_n > 0$ and $a_B - G_n < 0$. Multiplying these together we have

$$(a_T - G_n)(a_B - G_n) < 0$$

and expanding the bracket we get

$$a_T a_B - G_n(a_T + a_B) + G_n^2 < 0,$$

and if we rearrange this (using (L1) and (L2)) we obtain

$$a_T + a_B > \frac{a_T a_B}{G_n} + G_n.$$


Now notice that the left-hand side of this last expression is exactly the sum 
of the two terms we removed from the original list while the right-hand side is 
the sum of the two terms we replaced them with. When you substitute into the 
expressions for the two arithmetic means we see that this tells us that $A'_n < A_n$. (*)
Now I want you to think of the part of the proof that we’ve carried out so far
and which is sandwiched between the two (*)s as a single procedure. The next
thing to do is to carry out this procedure again so we let $a'_T = \max(a'_1, a'_2, \ldots, a'_n)$
and $a'_B = \min(a'_1, a'_2, \ldots, a'_n)$ and obtain a new set of numbers $a''_1, a''_2, \ldots, a''_n$
where $a'_T$ has been removed and replaced by $G_n$ and $a'_B$ has been taken away and
replaced by $\frac{a'_T a'_B}{G_n}$. If we repeat the reasoning that we carried out above we’ll find
that the arithmetic mean has changed to $A''_n < A'_n$ while the new geometric mean
is $G''_n = G_n$. Now after two applications of our procedure, two of the numbers on
the original list have changed to $G_n$.

Maybe you can guess what happens next. Mathematicians call it iteration and
use this to describe any procedure which is applied repeatedly and mechanically
towards some goal. So after three applications of the procedure, the arithmetic
mean will again be smaller, the geometric mean will stay the same and three
numbers in the list will take the value $G_n$. Now we keep going until the nth stage.
When we reach this stage, every number on the list is equal to $G_n$ which is the
arithmetic mean of that list by (3.5.9) but this is smaller than the arithmetic mean
at the $(n - 1)$th stage which is smaller than that at the $(n - 2)$th stage which is
(working backwards through the iterations) certainly smaller than $A_n$. So we’ve
shown that $G_n < A_n$, as was desired. □

Figure 3.7. Square having the same area as a given rectangle. [Rectangle R with
sides a and b; square S with sides $\sqrt{ab}$.]
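
The iteration used in this proof can be watched in action with a short Python sketch. It is an illustration under stated assumptions rather than part of the proof: the names `one_pass` and `iterate_means` are invented for this sketch, the inputs are assumed to be positive, and the arithmetic is done in floating point, so 'equal' below really means 'equal up to rounding'.

```python
import math

def one_pass(values, g):
    """Apply the (*) procedure once: replace the largest entry by g and the
    smallest entry by (largest * smallest) / g, so the product is unchanged."""
    new_values = list(values)
    top = new_values.index(max(new_values))
    bottom = new_values.index(min(new_values))
    a_top, a_bottom = new_values[top], new_values[bottom]
    new_values[top] = g
    new_values[bottom] = a_top * a_bottom / g
    return new_values

def iterate_means(values):
    """Return G_n together with the arithmetic means after 0, 1, ..., n passes."""
    n = len(values)
    g = math.prod(values) ** (1.0 / n)
    current = [float(v) for v in values]
    means = [sum(current) / n]
    for _ in range(n):
        current = one_pass(current, g)
        means.append(sum(current) / n)
    return g, means

g, means = iterate_means([3, 5, 1, 9, 11])
for stage, mean in enumerate(means):
    print(f"stage {stage}: arithmetic mean = {mean:.4f}")
print(f"geometric mean G_n = {g:.4f}")
```

For the list 3, 5, 1, 9, 11 the printed arithmetic means should fall from 5.8 towards the geometric mean, which stays fixed throughout.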

The theorem we’ve just proved can be strengthened to show that G n = A n if 
and only if all the numbers on the list are equal. We’ve proved part of this already 
(see (3.5.9)). You can prove the other part yourself by contemplating the way the 
proof of Theorem 3.5.1 works and thinking about what would happen if just two 
of the $a_i$s were to be unequal.

The Theorem of the Means has an interesting geometric interpretation. Let
$n = 2$ and consider the rectangle R with sides of length $a_1$ and $a_2$. The perimeter
of R is $2(a_1 + a_2)$ and by Theorem 3.5.1 we have

$$2(a_1 + a_2) = 4\left(\frac{a_1 + a_2}{2}\right) \ge 4\sqrt{a_1 a_2}.$$

Now $4\sqrt{a_1 a_2}$ is precisely the perimeter of a square S having sides of length
$\sqrt{a_1 a_2}$, as shown in Figure 3.7. Since both the square S and the rectangle R
have the same area $a_1 a_2$ we see that the Theorem of the Means has told us that
out of all possible rectangles that have the same area, the one with the smallest
perimeter is the square. A similar geometric interpretation can be found in higher
dimensions, see e.g.
http://en.wikipedia.org/wiki/Inequality_of_arithmetic_and_geometric_means#Geometric_interpretation.


3.6 Getting Closer 


One of our main reasons for learning about inequalities in this chapter (apart 
from their intrinsic interest) is so that we can use them in the next chapter in 
our study of limits. As a key step towards that goal we need to understand how 
mathematicians make sense of the concept of ‘closeness’. To make this more 
precise, suppose we are given a fixed real number - let’s call it l. We would like to
say what it means for other numbers to be arbitrarily close to l without necessarily
being equal to l.

Figure 3.8. Interval of width $2\epsilon$ centred on l. [Number line marking $l - \epsilon$, $l$ and $l + \epsilon$.]


To get a feel for this idea let’s take l = 1. Suppose that I want to identify all the
numbers that are within 0.01 of l. Then a moment’s thought tells me that I need
to be in the open interval (0.99, 1.01). Similarly if I wanted to be within 0.001 of
1 I would need the open interval (0.999, 1.001).

The numbers 0.01 and 0.001 are playing the role of ‘degree of closeness’ here.
When we return to the case of a general real number l we will need a symbol
to represent this degree of closeness and mathematicians have chosen the Greek
letter, which is denoted by $\epsilon$ and pronounced epsilon, to fulfil this role. So if we
want to get within $\epsilon$ of the real number l we must choose numbers that are in
the open interval $(l - \epsilon, l + \epsilon)$ as shown in Figure 3.8, which is exactly what we
did in the numerical examples where l = 1 and we took $\epsilon = 0.01$ and $\epsilon = 0.001$,
respectively.

How can we express the fact that a real number x is close to l with $\epsilon$ measuring
the degree of closeness? As we saw in Section 3.2 we must have

$$l - \epsilon < x < l + \epsilon. \tag{3.6.10}$$

Of course this is two inequalities rolled into one. The one on the left can be
rearranged using (L2) to give $l - x < \epsilon$ and the one on the right similarly yields
$x - l < \epsilon$, so we conclude that

$$|x - l| < \epsilon. \tag{3.6.11}$$

The fact that (3.6.10) and (3.6.11) are equivalent may just seem like pointless 
manipulation at the moment - but as the next chapter unfolds we will see how 
useful it is. 
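
The equivalence of (3.6.10) and (3.6.11) can be spot-checked numerically. The Python sketch below is an illustration only: the function names `within_eps` and `in_open_interval` are invented here, and exact fractions are used instead of decimals so that rounding cannot blur the endpoints of the interval.

```python
from fractions import Fraction

def within_eps(x, l, eps):
    """The modulus form (3.6.11): |x - l| < eps."""
    return abs(x - l) < eps

def in_open_interval(x, l, eps):
    """The two-sided form (3.6.10): l - eps < x < l + eps."""
    return l - eps < x < l + eps

l, eps = Fraction(1), Fraction(1, 100)
# Sample points 0.980, 0.981, ..., 1.020 around l = 1.
samples = [Fraction(k, 1000) for k in range(980, 1021)]
agree = all(within_eps(x, l, eps) == in_open_interval(x, l, eps) for x in samples)
print("the two tests agree on every sample:", agree)
```

A finite check like this proves nothing in general, but it does show the endpoints 0.99 and 1.01 being excluded by both tests.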


3.7 Exercises for Chapter 3 


1. Show that if a, b and c are real numbers for which a < b and b < c then a < c.
[Hint: Remember that a < c means the same as c - a > 0. Now introduce b
using the MFT.]

2. Prove that if a, b, c and d are real numbers satisfying 0 < a < b and 0 < c < d
then ac < bd.

3. Find all values of x for which < 1. [Hint: Treat the cases x < -3 and
x > -3 separately.]

4. Find all values of x for which | | < 1 . [Hint: Square both sides.] 

5. Show that for all real numbers a, b, c, d with $|c| \neq |d|$ we have

$$\left|\frac{a + b}{c + d}\right| \le \frac{|a| + |b|}{\bigl||c| - |d|\bigr|}.$$

Do you think that either of the inequalities | ^ | < j°j^j or j ^ | < M+jkj j s 
also true? Either present a proof or a counter-example in each case. 

6. Let a and b be arbitrary real numbers. Use the fact that we always have
$(a - b)^2 \ge 0$ to show that $ab \le \frac{1}{2}(a^2 + b^2)$. Hence deduce that
$(a + b)^2 \le 2(a^2 + b^2)$. How do you think that this last inequality might generalise when
the left-hand side is replaced by $(a_1 + a_2 + \cdots + a_n)^2$ for real numbers
$a_1, a_2, \ldots, a_n$?

7. Use the binomial theorem (see Appendix 1) to prove Bernoulli’s inequality:

$$(1 + x)^r \ge 1 + rx,$$

where $x > 0$ and r is a natural number. For which natural numbers r is this
inequality strict (i.e. $\ge$ can be replaced by $>$)? [If you know the technique of
mathematical induction (see Appendix 3) you can try proving that the inequality
holds for all $x > -1$.]


8. Consider the quadratic function

$$f(x) = ax^2 + bx + c,$$

where a, b and c are real numbers with $a > 0$. Show that

$$f(x) = a\left[\left(x + \frac{b}{2a}\right)^2 + \frac{4ac - b^2}{4a^2}\right]$$

and hence deduce that $f(x) \ge 0$ for all x if and only if $b^2 \le 4ac$.

9. Recall the ‘summation notation’ or ‘sigma notation’

$$\sum_{i=1}^{n} a_i = a_1 + a_2 + \cdots + a_n$$

(see Chapter 6 if you need to learn about this). A very famous and useful result
is Cauchy’s inequality for sums: If $a_1, a_2, \ldots, a_n$ and $b_1, b_2, \ldots, b_n$ are real
numbers then

$$\left|\sum_{i=1}^{n} a_i b_i\right| \le \left(\sum_{i=1}^{n} a_i^2\right)^{\frac{1}{2}} \left(\sum_{i=1}^{n} b_i^2\right)^{\frac{1}{2}}.$$

Verify that Cauchy’s inequality is correct by applying the results of Exercise 8 to
the quadratic function $F(x) = \sum_{i=1}^{n} (a_i x + b_i)^2$.

10. The next inequality may appear to be rather unexciting but we will use it in 
Chapter 5 to investigate square roots of prime numbers. Let p > 0 and let y be
any other positive real number for which y 2 > p. Show that 

p -l{ ,+ i) ^ 

[Hint: The right-hand inequality is straightforward. For the left hand one, use
the fact that if u and v are any two real numbers then

$$(u + v)^2 - 4uv = (u - v)^2 \ge 0.$$]

My Lovely?


When the successive values attributed to a variable approach indefinitely a fixed 
value so as to end by differing from it by as little as one wishes, the last is called 
the limit of all the others. 

A.-L. Cauchy quoted in A History of Mathematics, C.B. Boyer, U.C. Merzbach 


4.1 Limits 


In this section, we are going to meet the most important concept in this book.
It may also (without fear of hyperbole) be one of the most essential ideas in
the whole of mathematics. This beautiful and profound notion is the concept of 
the limit and its importance stems from the fact that it allows us to capture the 
infinite within the finite. 

The limit pervades the subject of analysis but the simplest place to meet it is
within the study of sequences. These are simply lists of numbers such as (S1) to
(S4) which appear below:

(S1) $1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}, \frac{1}{6}, \frac{1}{7}, \ldots$

(S2) $1, 2, \frac{33}{13}, \frac{14}{5}, \frac{85}{29}, 3, \frac{161}{53}, \frac{52}{17}, \frac{261}{85}, \ldots$

(S3) $1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, \ldots$

(S4) $1, \frac{2}{1}, \frac{3}{2}, \frac{5}{3}, \frac{8}{5}, \frac{13}{8}, \frac{21}{13}, \frac{34}{21}, \frac{55}{34}, \frac{89}{55}, \ldots$


Each of these shows the beginning of a list that extends indefinitely. Of course
there are many (an infinite number of!) ways to carry on each list in each case, but
in mathematics we usually expect that there is a pattern expressed by a formula
that enables us to calculate any term on the list (at least in principle). The pattern
for (S1) is pretty easy to identify. In this case the first term on the list is 1, the
second term is $\frac{1}{2}$, the third term is $\frac{1}{3}$ and so on. Hence the nth term is $\frac{1}{n}$ and we can
assert immediately (without having to count along to that point) that the 1024th
term is $\frac{1}{1024}$.

The pattern in (S2) is more difficult to discern and that’s probably because
it’s one that I manufactured myself by starting with the formula for the nth term
$a_n = \frac{2n + 3n^2}{4 + n^2}$. So for example, to get the 6th number on the list I put $n = 6$ into this
formula to get $\frac{12 + 108}{4 + 36} = \frac{120}{40} = 3$. The third and fourth lists are more famous and
interesting examples. (S3) is the sequence of Fibonacci numbers. It was studied by
Leonardo of Pisa (c.1180-1250) (who was also known as Fibonacci) and appears
in his book Liber Abaci which was published in 1202.$^1$ It describes an idealised
model of how a population of rabbits might grow. In this case the nth term in the
sequence is the number of pairs of rabbits at the end of the nth month. So if we
start with one pair of rabbits (one male and one female) at the beginning then
there is still one pair at the end of both the first and second months, but by the
end of the third month these rabbits have bred and the population doubles. There
is a very simple formula that can be used from this point on to calculate $f_n$, the
number of pairs of rabbits at the end of month n, from the numbers at the end of
the previous two months. It is

$$f_n = f_{n-1} + f_{n-2}, \tag{4.1.1}$$

so for example, I finished the list in (S3) with $f_{10} = 55$ and $f_{11} = 89$ from which
I can then calculate $f_{12} = 55 + 89 = 144$ and $f_{13} = 89 + 144 = 233$. Incidentally
any sequence like (4.1.1) in which later values are calculated in terms of earlier
ones is called a recursion.
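
A recursion like (4.1.1) translates almost word for word into a loop. Here is a minimal Python sketch, assuming the starting values $f_1 = f_2 = 1$ described above; the function name `fibonacci` is simply a label chosen for this illustration.

```python
def fibonacci(count):
    """Return the list f_1, f_2, ..., f_count built from the recursion (4.1.1)."""
    terms = [1, 1][:count]          # the two starting values f_1 = f_2 = 1
    while len(terms) < count:
        # Each new term is the sum of the previous two.
        terms.append(terms[-1] + terms[-2])
    return terms

first_thirteen = fibonacci(13)
print(first_thirteen)
```

The last two entries of the printed list are the values $f_{12}$ and $f_{13}$ calculated by hand above.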

The sequence (S4) is closely related to (S3). We won’t say much about it now
but we will come back to it later as it is associated with a very famous number -
the golden section (see page 22) and we’ll get to that number by taking limits!
For now it will suffice to just spot the general formula, so if $r_n$ is the nth term
in the sequence then you can check that it is the ratio of successive Fibonacci
numbers, i.e.

$$r_n = \frac{f_{n+1}}{f_n}. \tag{4.1.2}$$

We need to be able to develop a precise mathematical procedure for working
with sequences. We will formally define a sequence to be a list of real numbers that
is indexed by the natural numbers.$^2$ The generic sequence will be denoted by $(a_n)$
and the number which appears as the nth term is $a_n$. Please do not get confused
between $(a_n)$ and $a_n$. $(a_n)$ is shorthand for the list of numbers $a_1, a_2, a_3, \ldots$ which
goes on forever. I stress that it is a list and not a number. On the other hand $a_n$
really is a number, e.g. in (S2), $a_4 = \frac{14}{5} = 2.8$.

1 See e.g. http://en.wikipedia.org/wiki/Fibonacci#Fibonacci_sequence

2 This is a working definition which is fine for this book - but a more precise definition is
that a sequence is a mapping (or function) from the (set of) natural numbers to the (set of) real
numbers.

Figure 4.1. The harmonic sequence.

The key question we are now going to turn our attention to is - what happens
to a sequence $(a_n)$ as n gets very large? That’s a vague question and we’ll need
to explore a little further before we try to answer it. We begin by looking more
closely at the sequence (S1). It is called the harmonic sequence (Figure 4.1). This
terminology comes from the relationship between numbers and music that goes
back at least as far as the Pythagoreans.$^3$ They observed that if you pluck e.g. a
guitar string at $\frac{1}{2}, \frac{1}{3}, \frac{1}{4}$ etc. of its length then the pitch increases and you create
a succession of notes with higher frequencies. The terms in the sequence look as
though they are getting closer and closer to zero as n gets larger and larger but
observe that it can never reach zero. Indeed if there was a natural number N which
had the property that $\frac{1}{N} = 0$ we could multiply both sides by N to see that 1 = 0,
which is a contradiction.

When we look at (S2), we see that both the numerator and the denominator
of $a_n = \frac{2n + 3n^2}{4 + n^2}$ get larger and larger as n grows so it appears that the sequence is
being attracted towards the ‘number’ $\frac{\infty}{\infty}$ which we know to be undefined! On the
other hand, we can calculate $a_{12} = \frac{114}{37} = 3.08108\ldots$, $a_{100} = \frac{7550}{2501} = 3.01879\ldots$
which provides at least some evidence that we should interpret $\frac{\infty}{\infty}$ as 3 in this case.
In (S3) the story seems much simpler, as n gets larger then so does $f_n$, but (S4)
seems to be behaving more like (S1) and (S2) and you should check that the terms
appear to be closing in on a number that is in the region of 1.618.
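
The numerical evidence for (S2) and (S4) is easy to gather by machine. The Python sketch below is only a hedged illustration: the formula used in `s2_term` is an assumption, chosen because it reproduces the values quoted in the text ($a_4 = 2.8$, $a_6 = 3$ and $a_{100} = 3.01879\ldots$), and both function names are invented for this example.

```python
from fractions import Fraction

def s2_term(n):
    """n-th term of (S2), using the ASSUMED formula (2n + 3n^2) / (4 + n^2)."""
    return Fraction(2 * n + 3 * n * n, 4 + n * n)

def fibonacci_ratio(n):
    """n-th term of (S4): the ratio f_{n+1} / f_n of successive Fibonacci numbers."""
    previous, current = 1, 1        # f_1 and f_2
    for _ in range(n - 1):
        previous, current = current, previous + current
    return Fraction(current, previous)

for n in (4, 6, 12, 100, 1000):
    print(f"a_{n} = {float(s2_term(n)):.5f}")
for n in (5, 10, 20, 30):
    print(f"r_{n} = {float(fibonacci_ratio(n)):.6f}")
```

Of course a table of values like this is only suggestive; it cannot by itself establish what either sequence does for every large n.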

3 This term usually refers to followers of the mathematician/philosopher Pythagoras who
were active in the period from about 585 to 400 BCE.

We will now try to pin down more clearly the behaviour that we’ve identified
in (S1), (S2) and (S4). We’ll introduce two new terms. The first of these is the idea
of convergence and at this stage we will say that the sequence $(a_n)$ converges if
when n gets very large, the term $a_n$ gets arbitrarily close to a real number l. In this
case we will call l the limit of the sequence $(a_n)$. We have already seen in (S1) that
the limit (which is 0 in that case) may never be reached, but what does happen
there is that as n gets larger and larger we get arbitrarily close to 0. We have not
yet given adequate mathematical definitions of the concepts of convergence and
limit. To achieve this we need to capture the notion of arbitrary closeness more
precisely. We came some way towards doing that in Section 3.6, so if we fix a
degree of closeness $\epsilon$ then it seems that $|a_n - l| < \epsilon$ is what we are looking for.
But do we want this to hold for all values of n? If we do this in (S1) and take $n = 1$
and we believe that $l = 0$ we get $|a_1 - l| = 1 - 0 = 1$, which isn’t very small. So
what is missing from our definition? Well if you think about it I hope you’ll agree
that we can’t have ‘arbitrary closeness’ unless we have journeyed far enough along
the sequence. How do we express ‘far enough along’ mathematically? How about
this? Suppose we fix a natural number $n_0$ as a starting point and argue that for all
$n > n_0$ we must have arbitrary closeness. This will nearly work but what is still
missing is a connection between the two numbers: $\epsilon$ that measures closeness and
$n_0$ which identifies the point at which closeness kicks in. But maybe it’s time now
to stop this discussion and give the definition which we’ve been building up to.
This is the key definition in this book, so let’s concentrate very carefully on what
it says:

The sequence $(a_n)$ converges to the real number l if given any $\epsilon > 0$
there exists a natural number $n_0$ so that whenever $n > n_0$ we must have
$|a_n - l| < \epsilon$.

If such a number l exists we call it the limit of the sequence $(a_n)$ and we write

$$l = \lim_{n \to \infty} a_n. \tag{4.1.3}$$

An alternative (and completely equivalent) notation to (4.1.3) is
$a_n \to l$ as $n \to \infty$.

This is a subtle and powerful definition. It takes some time to internalise so
don’t worry if you feel like you don’t understand it yet. I can assure you that if
this is the case then you are in very good company. I also want to stress that the
notation $n \to \infty$ is a symbolic one that is supposed to be suggestive of approaching
infinity but not of reaching it (whatever that means) so that the problems that we
encountered in Section 2.4 are irrelevant here. The definition of convergence links
the two numbers $\epsilon$ and $n_0$ in the following way. As $\epsilon$ measures degree of closeness
we can take $\epsilon$ to be as small as we like, but the smaller we take $\epsilon$, the larger we must
take $n_0$ as we then have to go further along the sequence to find those numbers n
for which $|a_n - l| < \epsilon$.

For example consider (S1) again and let $\epsilon = 0.12$. To get $|a_n - l| = \left|\frac{1}{n} - 0\right| < \epsilon$
in this case, we require $\frac{1}{n} < 0.12$, i.e. $n > \frac{1}{0.12} = 8.333\ldots$ So here we can take
$n_0 = 8$ and the definition is satisfied - but only for this particular choice of $\epsilon$. If
we instead take $\epsilon = 0.012$, you can check that we then need $n_0$ to be at least 83
and $n_0 = 833$ or more is required if $\epsilon = 0.0012$. Do these numerical calculations
enable us to prove convergence? No they don’t, as the definition requires us to
consider all $\epsilon > 0$ and this is an infinite number of cases to consider. Playing
with specific numbers helps us to get a feel for how the definition works, but to 
legitimately prove convergence we need to be cleverer! 
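
The three numerical cases above can be reproduced with a short Python sketch. It is offered as an illustration, not as a proof: the function name `n0_for_harmonic` is invented here, and the degrees of closeness are passed as strings so that `Fraction` can treat them as exact decimals rather than rounded floating-point numbers.

```python
import math
from fractions import Fraction

def n0_for_harmonic(eps):
    """Smallest whole number n0 such that 1/n < eps for every n > n0.

    Since 1/n < eps means n > 1/eps, this is the whole-number part of 1/eps.
    """
    return math.floor(1 / Fraction(eps))

for eps in ("0.12", "0.012", "0.0012"):
    n0 = n0_for_harmonic(eps)
    # Spot-check the next fifty terms after n0 against the definition.
    ok = all(Fraction(1, n) < Fraction(eps) for n in range(n0 + 1, n0 + 51))
    print(f"eps = {eps}: n0 = {n0}, next fifty terms within eps: {ok}")
```

Each run handles one value of the degree of closeness at a time, which is exactly why calculations of this kind cannot replace a proof.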

Example 4.1: To show $\lim_{n \to \infty} \frac{1}{n} = 0$.

We begin with the first line of the definition and choose an arbitrary $\epsilon > 0$.
Now consider $\frac{1}{\epsilon}$. This is a real number (which is large when $\epsilon$ is small) and so
it has a decimal expansion $\frac{1}{\epsilon} = n_0.n_1 n_2 n_3 \cdots$, where $n_0$ is its whole-number
part. So $n_0 + 1 > \frac{1}{\epsilon}$ and by (L5) we have $\frac{1}{n_0 + 1} < \epsilon$. So given any $\epsilon > 0$,
if $n > n_0$ then $n \ge n_0 + 1$ and we have

$$\left|\frac{1}{n} - 0\right| = \frac{1}{n} \le \frac{1}{n_0 + 1} < \epsilon,$$

i.e. $\frac{1}{n} < \epsilon$ so this choice of $n_0$ satisfies the definition and we may ‘legitimately’
write $\lim_{n \to \infty} \frac{1}{n} = 0$.


I’ve put the word legitimately in quotes as although this ‘proof’ will do for now,
it isn’t really good enough. That’s because we’ve had to use a working definition of
a real number as one that has a decimal expansion in order to extract our $n_0$ from
our $\epsilon$. To give a fully rigorous proof we’ll need a more sophisticated understanding
of real numbers and this must be postponed for now. The good news is that none
of the general proofs in this chapter will require this deeper insight into the real
line and they are completely rigorous as they stand. In Chapter 11 we will prove
the Archimedean property of the real numbers and that gives the firm foundation
that we need for Example 4.1.

The first general theorem that we’ll look at concerns uniqueness.
Mathematicians worry quite a lot about this concept and if you think about it, they
are right to. If a sequence were to converge to more than one limit - then which
one is the right one? Fortunately, as the theorem below shows - that can never be
an issue.


Theorem 4.1.1. If a sequence converges to a limit, then that limit is unique. More
precisely if $(a_n)$ is a sequence such that $\lim_{n \to \infty} a_n = l$ and $\lim_{n \to \infty} a_n = l'$ then
$l = l'$.


Proof. We’ll use a proof by contradiction. Suppose that l and l' are as in the
statement of the theorem with $l \neq l'$. By definition of convergence, given any
$\epsilon > 0$ there exists a natural number $n_0$ such that if $n > n_0$ then $|a_n - l| < \frac{\epsilon}{2}$, (why
$\frac{\epsilon}{2}$ and not $\epsilon$? We’ll return to this point in the discussion after the proof) and there
also exists a natural number $m_0$ such that if $n > m_0$ then $|a_n - l'| < \frac{\epsilon}{2}$. Now we’ll
use MFT (the mathematician’s favourite trick) and the triangle inequality. Ensure
that $n > \max(m_0, n_0)$ so that we have closeness to both limits. Then

$$\begin{aligned}
|l - l'| &= |l - a_n + a_n - l'| \\
&\le |l - a_n| + |a_n - l'| \\
&< \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.
\end{aligned}$$

So we’ve shown that for any $\epsilon > 0$, $|l - l'| < \epsilon$ so that $|l - l'|$ is smaller than
any positive number. By properties of the modulus we also know that $|l - l'| \ge 0$
and so the only possibility is that $|l - l'| = 0$, i.e. $l = l'$ and this is our required
contradiction. □

It’s a good idea to read through the proof of Theorem 4.1.1 a few times to
make sure you really understand it. This will serve you in good stead for later as it
contains many features that are typical in analysis proofs. You should understand
why $m_0$ and $n_0$ cannot be chosen to be the same (if l and l' really were different,
you may need to go further along the sequence in one case than the other in order
to reach the desired degree of closeness). Finally why did I use $\frac{\epsilon}{2}$ instead of $\epsilon$? Well
first of all, there is a sense in which it doesn’t really matter. Remember that the
definition of a limit l of a sequence $(a_n)$ tells us that given any $\epsilon > 0$ there exists
$n_0$ such that if $n > n_0$ then $|a_n - l| < \epsilon$. Now we can certainly replace $\epsilon$ here with
$\frac{\epsilon}{2}$ (or indeed $K\epsilon$ for any fixed positive real number K) since being ‘given any $\epsilon > 0$’ is the
same as being ‘given any $\frac{\epsilon}{2} > 0$’ (think about it!). Of course $\frac{\epsilon}{2}$ is smaller than $\epsilon$ so
imposing a smaller degree of closeness requires us to choose larger $m_0$ and $n_0$ -
but so what? The reason for using $\frac{\epsilon}{2}$ rather than $\epsilon$ is one of mathematical style and
elegance. If we’d used $\epsilon$ then we’d have finished the proof with $|l - l'| < 2\epsilon$ and
this looks ugly to the trained mathematician. Doing Exercise 4.9 at the end of this
chapter will help you to clarify this issue.

Let’s move on from these finicky points to consider another example. In the
following we’ll fix $-1 < r < 1$.

Example 4.2: To show $\lim_{n \to \infty} r^n = 0$ (if $-1 < r < 1$).

Before we go further, let’s look at a concrete example and take $r = \frac{1}{2}$. As n gets
larger so does $2^n$ and so $\frac{1}{2^n}$ gets smaller and smaller so the result that is claimed
looks feasible.

To solve the problem we need to apply a case-by-case method:

Case 1: $0 < r < 1$. To prove the result in this case, we note that as $r < 1$, we
have $\frac{1}{r} > 1$ by (L5). It turns out to be a good idea to write $\frac{1}{r} = 1 + h$ where $h > 0$.
Then using the binomial theorem (see Appendix 1 if you need some background
on this) we get

$$\frac{1}{r^n} = \left(\frac{1}{r}\right)^n = (1 + h)^n = 1 + nh + \frac{1}{2}n(n - 1)h^2 + \cdots + h^n,$$

so that in particular - since all terms on the right-hand side are positive, we have$^4$

$$\frac{1}{r^n} > nh$$

and so by (L5) again

$$r^n < \frac{1}{nh}.$$

In Example 4.1, we showed that $\lim_{n \to \infty} \frac{1}{n} = 0$, and so given any $\epsilon > 0$, there
exists $n_0$ such that if $n > n_0$ then $\frac{1}{n} < \epsilon h$.$^5$ So if $n > n_0$ we see that

$$r^n < \frac{1}{h}\,\epsilon h = \epsilon,$$

and the proof is complete.

4 If you don’t want to use the binomial theorem it’s enough to notice that

$$(1 + h)^n = \underbrace{(1 + h)(1 + h) \cdots (1 + h)}_{n \text{ times}} > nh$$

as when you multiply out the n brackets you always get a term where the 1 in $(n - 1)$ of the
products of $(1 + h)$ meets the h in the final $(1 + h)$ and there are n ways in which this happens.
Note that we have essentially proved Bernoulli’s inequality from Exercise 3.7.

Case 2: $-1 < r < 0$. This is easy. We just repeat the above reasoning with r
replaced throughout by $|r|$ as $0 < |r| < 1$ (note that $|r^n - 0| = |r|^n$).

Case 3: $r = 0$. If $r = 0$ then $r^n = 0$ and it’s very easy to prove that $\lim_{n \to \infty} 0 = 0$ -
indeed something would be very wrong with the definition of convergence if this
wasn’t so. This completes the proof.
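
To see how the choice of $n_0$ in Case 1 plays out with actual numbers, here is a small Python sketch. It is an illustration under stated assumptions: the name `n0_for_power` is invented, the inputs are given as decimal strings so that `Fraction` keeps them exact, and the value returned is the one the argument above produces, which is usually far larger than the smallest $n_0$ that would work.

```python
import math
from fractions import Fraction

def n0_for_power(r, eps):
    """An n0 that works for the sequence (r^n) when 0 < |r| < 1.

    Following Case 1: write 1/|r| = 1 + h and pick n0 so that n > n0
    forces 1/n < eps * h, which in turn gives |r|^n < 1/(n h) < eps.
    """
    r = abs(Fraction(r))
    eps = Fraction(eps)
    h = 1 / r - 1
    return math.floor(1 / (eps * h))

for r in ("0.5", "0.9", "-0.9"):
    n0 = n0_for_power(r, "0.01")
    # Spot-check the ten terms immediately after n0.
    ok = all(abs(Fraction(r) ** n) < Fraction("0.01") for n in range(n0 + 1, n0 + 11))
    print(f"r = {r}: n0 = {n0}, next ten terms within 0.01: {ok}")
```

The proof only needs some $n_0$ to exist for each degree of closeness; it makes no claim that this one is the most economical.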

We can generalise the argument of case 3 to show that the constant sequence
which is such that $a_n = c$ for all n, where c is a fixed real number, converges to c.
So we see that $\lim_{n \to \infty} c = c$. In particular if we apply this to the case $c = 1$ we can
extend the convergence of the sequence $(r^n)$ to allow $r = 1$, but in this case the
limit is 1 and not 0. By symmetry, we might expect to be able to include $r = -1$
as well, but the argument breaks down here as $(-1)^n = 1$ when n is even and is
$-1$ when n is odd. So the sequence is $-1, 1, -1, 1, -1, 1, \ldots$ and there can be no
possibility of convergence to a limit.

We haven’t looked systematically yet at sequences that fail to converge and 
this is an opportunity too good to miss. Here’s a formal definition - a sequence 
is said to diverge if it doesn’t converge. This is not so much a definition, as a 
piece of terminology. The name ‘divergence’ has stuck in the literature but seems 
like something of a misnomer as intuitively we’d expect divergence to mean 
‘inexorably moving away’ or ‘getting larger in size’. In fact divergence can be 
subdivided into more than one type of behaviour and in two cases, there is some 
reconciliation with intuition. Before we give the definitions, let’s consider two 
examples. The sequence $(n) = 1, 2, 3, 4, 5, \ldots$ is just the natural numbers. They
grow inexorably larger and we know there is no end-point. To capture this type
of behaviour we say that a sequence $(a_n)$ diverges to $+\infty$ if, given any real number
$K > 0$, there exists a natural number $n_0$ such that $a_n > K$ for all $n > n_0$. This is
rather like the definition of convergence except that then we were dealing with
degree of closeness and we required a number $\epsilon$ that could be made arbitrarily
small. In place of $\epsilon$ we now have K that can be made arbitrarily large, but no
matter how large it is taken, if we go far enough along the sequence we can find a
point after which all the $a_n$s exceed K. Notice that the use of $\infty$ in the definition
is just as part of a name. There are no infinite numbers involved in the definition
itself. But what about the sequence $(-n)$ of negative integers? It clearly diverges
and has a similar behaviour to $(n)$ but it is going in the wrong direction for the
definition we’ve just given. In order to accommodate this sort of behaviour, we say
that a sequence $(b_n)$ diverges to $-\infty$ if given any real number $L > 0$ there exists
a natural number $m_0$ such that $b_n < -L$ for all $n > m_0$. Finally we say that a
sequence is properly divergent if it diverges to either $+\infty$ or $-\infty$.

5 The fixed number h will play exactly the same role here that $\frac{1}{2}$ did in the proof of
Theorem 4.1.1.

What about the sequence $(-1)^n$? It diverges but certainly not to $\pm\infty$.
A sequence that displays similar behaviour, i.e. endlessly cycles between two or
more numbers is said to oscillate. In the case of $(-1)^n$, we see that it oscillates
finitely as there are only a finite number of possibilities (two in this case). We’ll see
an example below of a sequence that oscillates infinitely.$^6$

Example 4.3: The divergence of the sequence $(r^n)$ if $r \le -1$ or $r > 1$.

We will show that $(r^n)$ diverges to $+\infty$ if $r > 1$. We do this by a similar
argument to that used in Case 1 of Example 4.2. We write $r = 1 + h$ where $h > 0$.
Using the binomial theorem as before (or arguing directly as in footnote 4 within
Example 4.2) we conclude that $r^n = (1 + h)^n > nh$. Now choose $K > 0$ to be
as large as you like and write the real number $\frac{K}{h} = m_0.m_1 m_2 m_3 \cdots$ We take
$n_0 = m_0 + 1$, then for $n > n_0$, we have

$$r^n > n_0 h = (m_0 + 1)h > \frac{K}{h} \cdot h = K,$$

and we are done.

If $r = -1$, we have already seen that the sequence oscillates finitely. Finally if
$r < -1$, the sequence oscillates infinitely. We won’t prove this but if you take e.g.
$r = -2$, you see that $(-2)^n = -2, 4, -8, 16, -32, \ldots$ and the behaviour is quite
clearly different from that of $(-1)^n$.
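
The recipe $n_0 = m_0 + 1$ from this example can be tried out numerically. The Python sketch below is illustrative only: the name `n0_for_growth` is invented, the value of r is passed as a decimal string so that `Fraction` keeps it exact, and only a handful of terms beyond $n_0$ are spot-checked.

```python
import math
from fractions import Fraction

def n0_for_growth(r, K):
    """The n0 from Example 4.3 for r > 1: the whole-number part of K/h, plus 1,
    where r = 1 + h. For n > n0 the bound r^n > n h then exceeds K."""
    r = Fraction(r)
    K = Fraction(K)
    h = r - 1
    return math.floor(K / h) + 1

for r, K in (("1.5", 1000), ("1.01", 10)):
    n0 = n0_for_growth(r, K)
    ok = all(Fraction(r) ** n > K for n in range(n0 + 1, n0 + 6))
    print(f"r = {r}, K = {K}: n0 = {n0}, next five terms exceed K: {ok}")

# The first few terms when r = -2, showing ever larger swings in sign.
print([(-2) ** n for n in range(1, 6)])
```

As with convergence, the bound is generous: the terms typically overtake K long before the index reaches this $n_0$.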

Even in as simple a sequence as $(r^n)$ we see that we encounter all but one of the
different notions of divergence as well as convergence, when we vary the value 
of r. It’s perhaps worth listing the results of Examples 4.2 and 4.3 so that we can 
examine them together. 


4.1.1 The story of the sequence $(r^n)$.

• If $r < -1$, the sequence oscillates infinitely.

• If $r = -1$, the sequence oscillates finitely.

• If $-1 < r < 1$, the sequence converges to 0.

• If $r = 1$, the sequence converges to 1.

• If $r > 1$, the sequence diverges to $+\infty$.

Figure 4.2. The story of $(r^n)$. [Number line for r: infinite oscillation for $r < -1$, finite
oscillation at $r = -1$, convergence to zero for $-1 < r < 1$, convergence to 1 at $r = 1$,
divergence to $+\infty$ for $r > 1$.]

6 Precise definitions of these concepts are given in the next section.
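
The five cases in this list amount to a lookup on the value of r, which can be written as a tiny Python function. This is just a restatement of the list as a sketch; the function name `describe_power_sequence` and the wording of the returned strings are choices made for this illustration.

```python
def describe_power_sequence(r):
    """Return the long-run behaviour of the sequence (r^n) for a real number r."""
    if r < -1:
        return "oscillates infinitely"
    if r == -1:
        return "oscillates finitely"
    if r < 1:
        return "converges to 0"
    if r == 1:
        return "converges to 1"
    return "diverges to +infinity"

for r in (-2, -1, -0.5, 0, 0.5, 1, 2):
    print(f"r = {r}: {describe_power_sequence(r)}")
```

Notice that the two boundary values need their own branches because their behaviour matches neither neighbouring region.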

We can think of the behaviour of this dynamical system from a different point 
of view. We are varying the parameter r over the whole real line and it divides 
that line into different regions corresponding to how the sequence $(r^n)$ behaves
asymptotically (i.e. for large n). The isolated points $r = -1$ and $r = 1$ are quite
interesting here (see Figure 4.2) as in each case they are boundaries between two
very different regions of behaviour, e.g. $r = 1$ is the boundary between the region
$0 < r < 1$ (convergence to zero) and $r > 1$ (divergence to $+\infty$). The behaviour
at the boundary is completely different (in each case) from that in either of the 
two regions that it separates and this is not untypical of far more complicated 
systems. 

At this stage it might have occurred to you that we can use limits of
sequences to make rigorous sense of what we mean by irrational numbers,
by building these systematically as limits of sequences of rational numbers.
For example $\sqrt{2}$ should be the limit of a sequence that starts off as
1, 1.4, 1.41, 1.414, 1.4142, 1.41421, 1.414213, 1.4142135, 1.41421356, ...$^7$ This is
indeed the case and the real number line can be given a very concise and
mathematically satisfying meaning as the completion of the rational numbers
through taking limits of sequences. The details are sophisticated. We’ll leave them
for now and come back to this important idea in Chapter 11.

We’ll finish this section with two useful results. We’ll first show that the limit 
of a sequence of positive numbers can never be negative. 

Theorem 4.1.2. Let $(a_n)$ be a sequence with each $a_n \ge 0$ and suppose that the
sequence converges to $l$. Then $l \ge 0$.

Proof. Assume that $l < 0$. As $(a_n)$ converges to $l$ we know that given any
$\epsilon > 0$ there exists a natural number $n_0$ such that if $n > n_0$ then $|a_n - l| < \epsilon$,
i.e. $l - \epsilon < a_n < l + \epsilon$ (recall (3.6.10) and (3.6.11)). Now $\epsilon$ can be any positive
number we like so let’s take $\epsilon = \frac{|l|}{2} = -\frac{l}{2}$. Then for $n > n_0$ we have
$a_n < l - \frac{l}{2} = \frac{l}{2}$ which is negative. So we’ve proved that $a_n$ is negative for all
$n > n_0$ which is a contradiction and so we conclude that $l \ge 0$, as required. □

7 You can extract the first million terms in this sequence by visiting
http://antwrp.gsfc.nasa.gov/htmltest/gifcity/sqrt2.1mil

Like so many results in analysis, Theorem 4.1.2 is delicate. If you change $a_n \ge 0$
to $a_n > 0$ you might at first expect that $l \ge 0$ should change to $l > 0$. This isn’t
true - for a counter-example (i.e. an example that by its very existence disproves
the claim) consider $a_n = \frac{1}{n}$, where every term is positive but the limit is 0.

The next result is a useful one about comparing two divergent sequences. We’ll 
omit the proof as it follows almost immediately from the definition and you should 
check this. 

Theorem 4.1.3. Suppose that $(a_n)$ and $(b_n)$ are two sequences with $a_n \le b_n$ for
all $n$.

1. If $(a_n)$ diverges to $+\infty$ then so does $(b_n)$.

2. If $(b_n)$ diverges to $-\infty$ then so does $(a_n)$.


By the way, now we know what limits really mean we can return to a theme we
discussed at the end of Section 1.2. We were trying to understand the behaviour
of $\pi(n)$ - the number of prime numbers less than or equal to $n$ - and I told you
about the celebrated prime number theorem - but only in a very vague way. Now
I can state this theorem precisely. It says very succinctly that

$$\lim_{n\to\infty} \frac{\pi(n)\log_e(n)}{n} = 1,$$

so that given any $\epsilon > 0$ there exists a natural number $N$ such that for all $n > N$,

$$\left|\frac{\pi(n)\log_e(n)}{n} - 1\right| < \epsilon.$$

In Section 1.2, I was misleading you as I asked you to look
at the behaviour of $\pi(n) - \frac{n}{\log_e(n)}$ and you can now see why this is incorrect.
$\frac{n}{\log_e(n)}$ is a good approximation to $\pi(n)$ in the sense that $\pi(n) \div \frac{n}{\log_e(n)}$ gets
closer and closer to 1 as $n$ gets larger and larger. You should test this claim out with
some numbers. Regrettably, we can’t prove the theorem here as it uses techniques that
go beyond the scope of this book.
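One way to test the claim with some numbers is sketched below in Python. The
function name `prime_counts` is hypothetical, and counting primes with the sieve
of Eratosthenes is simply a convenient method chosen for this illustration.

```python
from math import log


def prime_counts(limit):
    """Return a list pi with pi[n] = number of primes <= n, for 0 <= n <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False  # assumes limit >= 1
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            # cross out every multiple of p, starting from p*p
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    pi, count = [], 0
    for flag in is_prime:
        count += flag
        pi.append(count)
    return pi


pi = prime_counts(10 ** 6)
for n in (10 ** 2, 10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
    # the last column is pi(n) * log(n) / n, which the theorem says tends to 1
    print(n, pi[n], pi[n] * log(n) / n)
```

The ratio is still roughly 1.08 at $n = 10^6$, so the convergence promised by the
theorem is real but very slow.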


4.2 Bounded Sequences 


Before we prove any more theorems we need a new concept. We say that a
sequence $(a_n)$ is bounded if there exists a real number $K > 0$ such that

$$|a_n| \le K \text{ for all natural numbers } n,$$



i.e. we always have $-K \le a_n \le K$ so that all the terms of the sequence are
trapped between $-K$ and $K$. Notice that there is nothing special (at this point)
about the number $K$ in this definition. Once we’ve found a $K$ that works then any
larger number will also do the trick. (S1) is an example of a bounded sequence.
Here we can take any $K \ge 1$. (S3) is certainly not bounded. We’ll come back to
(S2) and (S4) later. We’ve already argued that (S1) is convergent and we’ve just
pointed out that it is bounded. But there are many examples of bounded sequences
that are not convergent, e.g. consider the sequence whose nth term is $(-1)^n$. We
know that it diverges but it is clearly bounded (again just take any $K \ge 1$). Any
sequence that fails to be bounded is said to be unbounded. There is an important
relationship between convergent and bounded sequences which is given in the
following theorem.

Theorem 4.2.1. If a sequence $(a_n)$ is convergent then it is also bounded.

Proof. We need to find $K > 0$ such that $|a_n| \le K$ for all $n$. But we know $(a_n)$
converges to some real number $l$ so given any $\epsilon > 0$ there exists a natural number
$n_0$ so that if $n > n_0$ then $|a_n - l| < \epsilon$. Now by MFT and the triangle inequality, if
$n > n_0$

$$|a_n| = |(a_n - l) + l| \le |a_n - l| + |l| < \epsilon + |l|.$$

So we can take $K = \epsilon + |l|$ provided $n > n_0$. We need a $K$ that works for all $n$ so
now suppose that $n \le n_0$. We then have

$$|a_n| \le \max(|a_1|, |a_2|, \ldots, |a_{n_0}|).$$

Now if we combine together the two pieces of our argument we see that a $K$ which
works for all $n$ is

$$K = \max(|a_1|, |a_2|, \ldots, |a_{n_0}|, \epsilon + |l|),$$

and that completes the proof. □

Theorem 4.2.1 tells us that the convergent sequences ‘sit inside’ the bounded
sequences.⁸ If you reverse the logic in the statement of the theorem then you see
that an unbounded sequence must diverge. This is often the quickest way to see
that a sequence is divergent and (S3) is a case in point here.


8 If you know about sets then the set of all convergent sequences is a subset of the set of
all bounded sequences. See Appendix 2.
We can use the concept of a bounded sequence to make the classification of
divergent sequences much more precise. We say that a sequence

• oscillates if it is neither convergent nor properly divergent.

• oscillates finitely if it oscillates and is bounded.

• oscillates infinitely if it oscillates and is not bounded.

Now that we have this definition it should be an easy exercise for you to prove
that $(r^n)$ oscillates infinitely when $r < -1$. Note that a sequence can oscillate finitely
but still take on infinitely many different values, e.g. consider the sequence whose 
nth term is (— l)”i. The word ‘finitely’ here is really indicating the boundedness 
of the sequence and the fact that there is some finite interval from which it can 
never escape (which is [—1, |] for the last example). 

We’ll finish this section with a useful little result. First a definition. If a sequence
$(a_n)$ converges to zero then it is called a null sequence.

Theorem 4.2.2. If $(a_n)$ is a null sequence and $(b_n)$ is bounded then the sequence
$(a_n b_n)$ is also a null sequence.

Proof. We need to show that $(a_n b_n)$ converges to zero. Since $(b_n)$ is bounded
there exists $K > 0$ such that $|b_n| \le K$ for all $n$. On the other hand since $(a_n)$
converges to zero, given any $\epsilon > 0$ there exists $n_0$ such that for all $n > n_0$,
$|a_n| = |a_n - 0| < \frac{\epsilon}{K}$. But then for such $n$

$$|a_n b_n| = |a_n| \cdot |b_n| < \frac{\epsilon}{K} \cdot K = \epsilon,$$

and that is what is needed to prove the result. □

As an example, we find that $\left((-1)^n \frac{1}{n}\right)$ is a null sequence since the sequence $((-1)^n)$
is bounded. If you know about trigonometry, then Theorem 4.2.2 can also be used 
to show that sequences like a n — are null sequences. 

Beware that if $(a_n)$ converges to a non-zero limit, for $(b_n)$ to be bounded is not
enough to guarantee convergence of $(a_n b_n)$, e.g. take $a_n = 1 - \frac{1}{n}$ which converges
to 1 and $b_n = (-1)^n$. In this case $(a_n b_n)$ oscillates finitely.
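To see the proof of Theorem 4.2.2 at work numerically, here is a hedged Python
sketch. The example $a_n = \frac{1}{n}$, $b_n = \sin(n)$ (bounded by $K = 1$) is my own choice of
illustration, and the names `n0_for` and `products_within` are invented for it.

```python
from math import floor, sin


def n0_for(eps, K=1.0):
    """For a_n = 1/n: every n > floor(K/eps) satisfies 1/n < eps/K."""
    return floor(K / eps)


def products_within(eps, how_many=1000):
    """Check |a_n * b_n| < eps for the first `how_many` values of n beyond n0."""
    n0 = n0_for(eps)
    return all(abs(sin(n) / n) < eps for n in range(n0 + 1, n0 + 1 + how_many))


for eps in (0.25, 0.01, 0.001):
    print(eps, n0_for(eps), products_within(eps))
```

A finite check like this is evidence rather than proof, but it mirrors the
argument: the $n_0$ found for the null sequence with tolerance $\frac{\epsilon}{K}$ also works for
the product with tolerance $\epsilon$.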


4.3 The Algebra of Limits 


Let’s return to the sequence (S2) whose nth term is $a_n = \frac{2n + 3n^2}{n^2 + 4}$. We presented some
evidence earlier that the limit might be 3. How would we prove this properly?
First of all we’ll do a little bit of algebra that has nothing to do with analysis. We’ll
divide every term in the numerator and denominator of the fraction by the highest
power of $n$ that occurs. This is $n^2$ and we get:

$$a_n = \frac{2n + 3n^2}{n^2 + 4} = \frac{\frac{2}{n} + 3}{1 + \frac{4}{n^2}}.$$

Now we know that $\lim_{n\to\infty} \frac{1}{n} = 0$, $\lim_{n\to\infty} 3 = 3$, $\lim_{n\to\infty} 1 = 1$ and
$\lim_{n\to\infty} 2 = 2$. We could indeed argue that $\lim_{n\to\infty} a_n = 3$ if we could justify
writing

$$\lim_{n\to\infty} a_n = \frac{2\lim_{n\to\infty}\frac{1}{n} + 3}{1 + 4\lim_{n\to\infty}\frac{1}{n}\cdot\lim_{n\to\infty}\frac{1}{n}}.$$


It turns out that this sort of reasoning is indeed justified and the general result 
that we need is given in the next theorem - which is often known as the algebra of 
limits. 


Theorem 4.3.1. Suppose that $(a_n)$ and $(b_n)$ are convergent sequences with
$\lim_{n\to\infty} a_n = l$ and $\lim_{n\to\infty} b_n = m$. Then

1. The sequence whose nth term is $a_n + b_n$ converges to $l + m$.

2. The sequence whose nth term is $a_n b_n$ converges to $lm$.

3. If $c$ is any real number then the sequence whose nth term is $c a_n$ converges
to $cl$.

4. If $b_n \ne 0$ for all $n$ and also $m \ne 0$ then the sequence whose nth term is $\frac{a_n}{b_n}$
converges to $\frac{l}{m}$.

Proof.

1. This is fairly similar to that of Theorem 4.1.1 and it goes like this. Given any
$\epsilon > 0$, we know that there exists a natural number $n_0$ such that if $n > n_0$ then
$|a_n - l| < \frac{\epsilon}{2}$ and there also exists a natural number $p_0$ such that if $n > p_0$ then
$|b_n - m| < \frac{\epsilon}{2}$. Now choose $n > \max(n_0, p_0)$ and apply the triangle inequality
to see that

$$|(a_n + b_n) - (l + m)| = |(a_n - l) + (b_n - m)| \le |a_n - l| + |b_n - m| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$

2. Here we’ll use the MFT and the triangle inequality in what I hope is now a
familiar way - but we’ll also need to appeal to Theorem 4.2.1. First we have

$$|a_n b_n - lm| = |a_n b_n - l b_n + l b_n - lm| \le |b_n(a_n - l)| + |l(b_n - m)| = |b_n||a_n - l| + |l||b_n - m|, \quad\ldots(*)$$

where we have used (3.3.5) to get the last line. At this stage we’ll assume that
$l \ne 0$ and worry about what happens when $l = 0$ later on. Now the sequence
$(b_n)$ is convergent and so by Theorem 4.2.1 is bounded. Hence there exists a
real number $K > 0$ such that $|b_n| \le K$ for all $n$. So we can go back to (*) and
write

$$|a_n b_n - lm| \le K|a_n - l| + |l||b_n - m|. \quad\ldots(**)$$

Now choose $\epsilon > 0$, then there exists $n_0$ such that if $n > n_0$, $|a_n - l| < \frac{\epsilon}{2K}$
and there exists $p_0$ such that $|b_n - m| < \frac{\epsilon}{2|l|}$ whenever $n > p_0$. From (**) we
then get for $n > \max(n_0, p_0)$,

$$|a_n b_n - lm| < K \cdot \frac{\epsilon}{2K} + |l| \cdot \frac{\epsilon}{2|l|} = \epsilon.$$

That proves the theorem in the case where $l \ne 0$. If $l = 0$ then just go
back to (*) and use the fact that given $\epsilon > 0$ there exists $r_0$ such that if
$n > r_0$, $|a_n| < \frac{\epsilon}{K}$.


3. This follows from the result (2) that we’ve just proved by taking $(b_n)$ there
to be the constant sequence whose nth term is the real number $c$.


4. This proof is a bit finicky so we’ll do it in stages. First we’ll do the hard part
and show that $\lim_{n\to\infty} \frac{1}{b_n} = \frac{1}{m}$. Now this will mean that we’ll have to look
at terms like

$$\left|\frac{1}{b_n} - \frac{1}{m}\right| = \left|\frac{m - b_n}{m b_n}\right| = |b_n - m| \cdot \frac{1}{|b_n|} \cdot \frac{1}{|m|}. \quad\ldots(\dagger)$$

Now for large enough $n$ we can make $|b_n - m|$ as small as we like and $\frac{1}{|m|}$
is constant and so presents no problems. The problem term in (†) is $\frac{1}{|b_n|}$ so
let’s focus on that. To deal with this we’ll need to be clever and we’ll choose
$\epsilon < \frac{|m|}{2}$. As we continue the argument, you’ll see why this is a good idea.
Given such an $\epsilon$ we can as usual find $n_0$ such that if $n > n_0$ then

$$|b_n - m| < \epsilon < \frac{|m|}{2}.$$

So by Corollary 3.3.1, we have

$$|m| - |b_n| < \frac{|m|}{2},$$

and so by (L2), $|b_n| > \frac{|m|}{2}$. Now we can use (L5) to deduce that

$$\frac{1}{|b_n|} < \frac{2}{|m|}.$$

We can then use the same argument as in the proof of Theorem 4.2.1
to see that the sequence whose nth term is $\frac{1}{b_n}$ is bounded with

$$K = \max\left(\frac{1}{|b_1|}, \frac{1}{|b_2|}, \ldots, \frac{1}{|b_{n_0}|}, \frac{2}{|m|}\right).$$

Now let’s return to (†). With $\epsilon$ as chosen we see that for $n > n_0$ we have

$$\left|\frac{1}{b_n} - \frac{1}{m}\right| < \frac{K}{|m|}\,\epsilon,$$

and that will suffice to establish that $\lim_{n\to\infty} \frac{1}{b_n} = \frac{1}{m}$. Finally, to show the
general result claimed in the theorem, we just write $\frac{a_n}{b_n} = a_n \cdot \frac{1}{b_n}$ and use the
result of (2) that was proved above. □


In the proof we’ve just given that $\lim_{n\to\infty} \frac{1}{b_n} = \frac{1}{m}$ we replaced $\epsilon$ in the usual
definition of convergence with $\frac{K}{|m|}\epsilon$. This is justified in exactly the same way as we
argued that $\epsilon$ can be replaced by $\frac{\epsilon}{2}$ in the proof of Theorem 4.1.1. You may find it
helpful to return to the discussion of that point (see also Exercise 4.9).

Now that we’ve proved Theorem 4.3.1 you should go back to the sequence (S2) 
and convince yourself that every step can be justified to prove that the limit is 3. 
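Before you do that, it can be reassuring to look at the numbers. The Python sketch
below (with made-up function names `s2` and `s2_divided`) computes the terms of (S2)
exactly, both in their original form and after dividing through by $n^2$, and prints
how far each is from 3.

```python
from fractions import Fraction


def s2(n):
    """nth term of (S2), (2n + 3n^2) / (n^2 + 4), kept exact with Fraction."""
    return Fraction(2 * n + 3 * n ** 2, n ** 2 + 4)


def s2_divided(n):
    """The same term after dividing top and bottom by n**2."""
    return (Fraction(2, n) + 3) / (1 + Fraction(4, n ** 2))


for n in (1, 10, 100, 1000, 10 ** 6):
    # second column: the term as a decimal; third column: its distance from 3
    print(n, float(s2(n)), float(abs(s2(n) - 3)))
```

The two forms agree for every $n$, which is the ‘little bit of algebra’ step, and
the distance from 3 shrinks as $n$ grows, which is what the algebra of limits then
turns into a proof.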


4.4 Fibonacci Numbers and the Golden Section


Let’s return to the Fibonacci sequence (S3). We’ll do two things in this section. First
we’ll obtain a general formula which allows us to calculate $f_n$ for any value of $n$ and
secondly we’ll calculate the limit of the sequence (S4), i.e. we’ll find $\lim_{n\to\infty} r_n =
\lim_{n\to\infty} \frac{f_{n+1}}{f_n}$. Notice that we can’t use algebra of limits here as $\lim_{n\to\infty} f_n = \infty$
and $\frac{\infty}{\infty}$ has no meaning. From equation (4.1.1) we know that $f_n = f_{n-1} + f_{n-2}$ and
we also have the starting points $f_1 = f_2 = 1$. It will make the analysis slightly easier
below if we also define $f_0 = 0$. There’s nothing dodgy about this. Indeed we had
zero rabbits before we started and it allows us to include the case $n = 2$ in (4.1.1).
Now I’m going to rewrite the equation (4.1.1) using a different notation:

$$g_n = g_{n-1} + g_{n-2}. \qquad (4.4.4)$$

What is the point of this? Well $f_n$ is our notation for Fibonacci numbers. We
know that $f_n$ is a solution of the equation (4.4.4) but there might be other solutions
that are nothing to do with Fibonacci numbers. Using the notation $g_n$ allows us to
talk generally about solutions of the equation and then we’ll wind our way back
to Fibonacci numbers later on.

Equation (4.4.4) is an example of a difference equation. It’s not difficult to solve
equations of this kind. We need to find a candidate solution. This is something
you do by trial and error and it turns out to be sensible to experiment with a trial
solution $g_n = r^n$ where $r > 0$. At this stage you should be aware that I am not
saying that $r^n$ really is a/the solution to (4.4.4). We’ll just pretend that it is and
then see what happens. When we substitute $r^n$ into (4.1.1) we obtain the equation:

$$r^n = r^{n-1} + r^{n-2}.$$

This looks pleasant but perhaps hard to solve? In fact it’s easy - since $r > 0$ we
can divide both sides by the common factor $r^{n-2}$ to get the quadratic equation

$$r^2 = r + 1$$

which we can solve by using the famous ‘quadratic equation formula’,⁹ to find
that the equation has two solutions:

$$r = \frac{1 \pm \sqrt{5}}{2}. \qquad (4.4.5)$$


These are the only values of $r$ for which $g_n = r^n$ solves our equation (4.4.4).
The larger of these is the golden section $\frac{1 + \sqrt{5}}{2}$ which is often denoted by the Greek
letter $\phi$ (pronounced ‘phi’). You may recall that we briefly mentioned this in our
list of famous irrational numbers in Section 2.2. The other solution is the negative
number $\frac{1 - \sqrt{5}}{2}$ and you can easily check that

$$1 - \phi = -\frac{1}{\phi} = \frac{1 - \sqrt{5}}{2}.$$


It follows that $\phi^n$ and $(1 - \phi)^n$ are both solutions of the equation (4.4.4). But
in what sense do we get the Fibonacci numbers from these? We have to be a little
more careful. First of all you can check that $A\phi^n$ is also a solution of (4.4.4) - just
multiply both sides of the equation by $A$. Similarly $B(1 - \phi)^n$ is a solution for any
real number $B$. Finally we can add these together to see that $A\phi^n + B(1 - \phi)^n$ is
also a solution. Although we won’t give a proof here, there are no other solutions
and we call

$$g_n = A\phi^n + B(1 - \phi)^n, \qquad (4.4.6)$$

the general solution. $A$ and $B$ are free to take on any values that we like - but not
if we want to get the Fibonacci numbers. So from now on we are going to find
conditions for which $g_n = f_n$. In this case we know that $f_0 = 0$ and $f_1 = 1$ and this
gives constraints on the values of $A$ and $B$. To be consistent with the first of these,
we put $n = 0$ and then

$$0 = A + B, \quad \text{i.e. } B = -A.$$

For consistency with the second constraint, we put $n = 1$ in (4.4.6) and get

$$1 = A\phi - A(1 - \phi) = (2\phi - 1)A,$$

and so

$$A = \frac{1}{2\phi - 1} = \frac{1}{\sqrt{5}}.$$

This tells us that the formula for the nth Fibonacci number is

$$f_n = \frac{1}{\sqrt{5}}\left[\phi^n - (1 - \phi)^n\right]. \qquad (4.4.7)$$

9 The solutions to $ar^2 + br + c = 0$ are $r = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$. In our case $a = 1$, $b = c = -1$.

Isn’t that beautiful? The wonderful thing about this formula is that the
combination of irrational numbers on the right-hand side always produces
Fibonacci numbers which are natural numbers. You should check this for yourself
in the cases $n = 2, 3, 4$.
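If you want to go beyond $n = 2, 3, 4$, a computer will happily do the arithmetic.
The Python sketch below compares formula (4.4.7) with the recurrence; the function
names are invented for this illustration and, because the formula is evaluated in
floating-point arithmetic, its output is rounded to the nearest whole number before
comparing.

```python
from math import sqrt

PHI = (1 + sqrt(5)) / 2  # the golden section


def fib_recurrence(n):
    """f_n computed from f_0 = 0, f_1 = 1 and f_n = f_(n-1) + f_(n-2)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_formula(n):
    """f_n computed from formula (4.4.7), in floating-point arithmetic."""
    return (PHI ** n - (1 - PHI) ** n) / sqrt(5)


for n in range(0, 11):
    print(n, fib_recurrence(n), fib_formula(n))
```

For much larger $n$ the floating-point version eventually loses accuracy, so treat
this as a check of the formula for modest $n$ rather than a practical way to compute
Fibonacci numbers.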

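Now we’ll calculate the limit of the sequence (S4). By (4.4.7) we have

$$r_n = \frac{f_{n+1}}{f_n} = \frac{\phi^{n+1} - (1 - \phi)^{n+1}}{\phi^n - (1 - \phi)^n}.$$

We can divide top and bottom of this fraction by $\phi^n$ to get

$$r_n = \frac{\phi - (1 - \phi)\left(\frac{1 - \phi}{\phi}\right)^n}{1 - \left(\frac{1 - \phi}{\phi}\right)^n}.$$

Since $\phi > |1 - \phi|$ (why?) we have $\left|\frac{1 - \phi}{\phi}\right| < 1$ and so by the results of Example
4.2, $\lim_{n\to\infty} \left(\frac{1 - \phi}{\phi}\right)^n = 0$. Then by the algebra of limits (Theorem 4.3.1):

$$\lim_{n\to\infty} r_n = \phi,$$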

and so we’ve proved that the limit of the ratio of successive Fibonacci numbers is 
the golden section - a result that was first suggested by the renowned astronomer 
Johannes Kepler (1571-1630). 
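Here is a short, hedged Python sketch of that limit in action. The function name
`fibonacci_ratios` is made up, and the ratios are computed in ordinary
floating-point arithmetic, so the comparison with $\phi$ is only accurate to about
fifteen significant figures.

```python
from math import sqrt

PHI = (1 + sqrt(5)) / 2  # the golden section


def fibonacci_ratios(count):
    """The ratios r_1, ..., r_count, where r_n = f_(n+1) / f_n."""
    ratios = []
    f_prev, f_curr = 1, 1  # f_1 and f_2
    for _ in range(count):
        ratios.append(f_curr / f_prev)
        f_prev, f_curr = f_curr, f_prev + f_curr
    return ratios


for n, r in enumerate(fibonacci_ratios(15), start=1):
    # last column: how far r_n is from the golden section
    print(n, r, abs(r - PHI))
```

The ratios hop alternately below and above $\phi$ while the distance to it shrinks at
every step.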

Why is $\phi$ such an important number? It expresses a very natural relationship
of great beauty which has been much exploited in geometry and architecture.
Consider a section of a straight line that is divided into two parts and choose units
so that the length of the smaller part has length 1. Call $x$ the length of the larger
part so that the section as a whole has length $1 + x$, as shown in Figure 4.3. To
obtain the desired proportion we require $x$ to be such that the ratio of the smaller
to the larger is precisely equal to the ratio of the larger to the whole, i.e.

$$\frac{1}{x} = \frac{x}{1 + x}.$$

[Figure 4.3. The golden section.]

When we cross-multiply we get the quadratic equation $x^2 - x - 1 = 0$ and as
we’ve already seen, the unique solution with $x > 0$ is $x = \phi$.


4.5 Exercises For Chapter 4 


1 Consider the sequence a n = 1 + Can you guess the limit /of this sequence 7 

(a) Verify that your guess is feasible by finding a natural number $n_0$, for each of
the following given values of $\epsilon$, such that $n > n_0 \Rightarrow |a_n - l| < \epsilon$:

(i) $\epsilon = 0.1$ (ii) $\epsilon = 0.01$ (iii) $\epsilon = 0.001$ (iv) $\epsilon = 0.0001$ (v) $\epsilon = 10^{-10}$

(b) Give a rigorous proof that $(a_n)$ converges to $l$.


2. Use the definition of convergence to find the limits of the sequences whose nth
terms are as follows:

(a) $1 - \frac{1}{n}$ (b) $\frac{3}{n}$ (c) $\frac{1}{n^2}$ (d) $\frac{1}{\sqrt{n}}$.

In each case you should proceed by finding the value of $n_0$ which ensures that
‘closeness’ is satisfied for any given $\epsilon$.

3 . Write down a formula for the general term of a sequence (a n ) so that 
o, , o 2 , o 3 , a k and o 5 are precisely 1 . 7. | and use the definition of limit 

to show that the sequence converges to 

4. Show that if $(x_n)$ converges to $x$ then $(|x_n|)$ converges to $|x|$. Is the converse
true? If so, give a proof and if not, present a counter-example.


5. If $a_n \to 0$ as $n \to \infty$ and $0 \le b_n \le a_n$ for all $n$, show that $\lim_{n\to\infty} b_n = 0$.


6. Use the algebra of limits to find the limits of the following sequences: 


(a) 

(d) 


1 

2 - - 
n 


1 

3 H — 
n 


n 2 + 1 
2 n 2 — n + 2 


(b) 

(e) Vn + 1 - s/n 


i + 7s) 


(C) 


2n + 3 
5 n + 9 


7. The following were all written down in an examination in answer to the question,
‘What is the definition of a sequence $(x_n)$ converging to a limit $x$?’ Say what is
wrong, if anything, with each of them.

(a) For some $\epsilon > 0$ there is an $N$ such that $|x_n - x| < \epsilon$ for $n > N$.

(b) Where $\epsilon > 0$, for some natural number $N$ where $n > N$, $|x_n - x| < \epsilon$.

(c) For every positive number $\epsilon$ there is a term in the sequence after which all
the following terms are within $\epsilon$ of $x$.

(d) For any $\epsilon > 0$ there is some $n > N$ such that $|x_n - x| < \epsilon$.

(e) For some $\epsilon > 0$ there is a natural number $N < n$ such that $|x_n - x| < \epsilon$ for
all $n$ past a certain point.

8. The purpose of this question is to show that the order of the words in the
definition of convergence is critical. A sequence $(x_n)$ is defined to be ridiculously
convergent to $x$ (this is just made up for this question) if there exists a natural
number $N$ such that for every $\epsilon > 0$ we have $|x_n - x| < \epsilon$ whenever $n > N$.

(a) Comment on the difference between ‘ridiculous convergence’ and ‘convergence’
(in the usual sense).

(b) Show that the sequence $\left(\frac{1}{n}\right)$ is not ridiculously convergent to 0.

9. Suppose that the sequence $(x_n)$ converges to $x$. Let $C > 0$ be a fixed positive
constant. Show that for any $\epsilon > 0$ there is a natural number $N$ such that
$|x_n - x| < C\epsilon$ whenever $n > N$.

10. Show that if $(x_n)$ is a sequence converging to $x$ and that $x_n \le a$ for all $n$ then
$x \le a$. [Hint: Use the result of Theorem 4.1.2.]

11. The ‘sandwich rule’ says that if $(a_n)$, $(b_n)$ and $(c_n)$ are three sequences for which
$a_n \le b_n \le c_n$ for all $n$ and where $(a_n)$ and $(c_n)$ both converge to the same
limit $l$, then $(b_n)$ also converges to $l$. Prove this result. [Hint: Use the result of
Exercise 4.5.]

12. Use the sandwich rule to find the limit of the sequence whose nth term is
$\frac{n - \cos(n)}{n}$.

13. Consider a positive sequence $(x_n)$, i.e. one for which each $x_n > 0$, and assume
that the sequence converges to a positive limit. Show that $\lim_{n\to\infty} \frac{x_{n+1}}{x_n} = 1$.
Give examples, one in each case, of a convergent positive sequence $(x_n)$ for
which the sequence whose nth term is $\frac{x_{n+1}}{x_n}$ (i) converges to zero, (ii) converges
to a half, (iii) diverges (trickier).

14. (a) Let $r > 1$ and consider the sequence $(r^{1/n})$. Prove that it converges to 1.
[Hint: Write $r^{1/n} = 1 + c_n$ where $c_n > 0$ and use Bernoulli’s inequality from
Exercise 3.7 to show that $\lim_{n\to\infty} c_n = 0$.]

(b) Show that lim^^ rr = 1 when 0 < r < 1 . [Hint: Write r = 1] 

l ^4 

(c) Prove that $\lim_{n\to\infty} n^{1/n} = 1$. [Hint: Write $\sqrt{n^{1/n}} = 1 + c_n$.]

15. This question deals with the important notion of subsequence. Let $(a_n)$ be an
arbitrary sequence. To get a subsequence of $(a_n)$ we first take any increasing
sequence of natural numbers: $n_1 < n_2 < n_3 < \cdots < n_r < \cdots$ We then form the
sequence $b_r = a_{n_r}$, so this sequence begins $a_{n_1}, a_{n_2}, \ldots, a_{n_r}, \ldots$ Then $(b_r)$ is called
a subsequence of $(a_n)$, e.g. we obtain a subsequence of $\left(\frac{1}{n}\right)$ by taking every fifth
term to get $\frac{1}{5}, \frac{1}{10}, \ldots$ Now suppose that $(a_n)$ converges to a limit $l$. Show
that every subsequence of $(a_n)$ also converges to $l$.

16. Find two convergent subsequences of the sequence whose nth term is $(-1)^n$.
[In the exercises for Chapter 5, we will prove the Bolzano-Weierstrass theorem
which states that every bounded sequence has at least one convergent
subsequence.]
Bounds for Glory


Among the small there is no smallest, but always something smaller.

Anaxagoras quoted in Philosophy of Mathematics and Natural
Science, Hermann Weyl


5.1 Bounded Sequences Revisited 


The limit is the hero of this book. But no great hero can achieve their destiny
alone. They need companions who can help them along the way. So it is with
the limit. In this chapter we’ll meet its two helpers - the twin concepts of the ‘sup’
and the ‘inf’.

We’ve already spent some time with bounded sequences in Section 4.2 and 
we’ve seen that every convergent sequence is bounded but that bounded sequences 
may not necessarily be convergent. In this chapter, we’ll go into the subject of 
bounded sequences more deeply. To start us off let’s consider the sequence $(a_n)$
whose nth term is $a_n = \frac{1}{n} + (-1)^n$; so it is the sum of two sequences, one of which
is convergent while the other diverges. The sequence begins

$$0, \frac{3}{2}, -\frac{2}{3}, \frac{5}{4}, -\frac{4}{5}, \frac{7}{6}, -\frac{6}{7}, \ldots$$

There is a pattern here and you can see that the even terms are always of the form
$a_{2n} = \frac{2n+1}{2n}$, while the odd ones can be written $a_{2n-1} = \frac{2-2n}{2n-1}$. The sequence doesn’t
converge, indeed it oscillates finitely. In fact it never gets larger in magnitude than
the second term $\frac{3}{2}$. To see this we observe that

$$|a_{2n}| = \frac{2n+1}{2n} = 1 + \frac{1}{2n} \le 1 + \frac{1}{2} = \frac{3}{2}, \quad\text{and}\quad |a_{2n-1}| = \frac{2n-2}{2n-1} < 1 < \frac{3}{2},$$

so we have $|a_n| \le \frac{3}{2}$ for all $n$, and so $-\frac{3}{2} \le a_n \le \frac{3}{2}$. In fact we can refine these
bounds. Notice that we also have

$$a_{2n-1} = \frac{2-2n}{2n-1} > \frac{1-2n}{2n-1} = -1,$$

so we obtain

$$-1 < a_n \le \frac{3}{2}.$$

We call $\frac{3}{2}$ an upper bound for the sequence $(a_n)$ and $-1$ a lower bound.
Upper and lower bounds give us additional tools for describing the way in which
sequences behave.
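These bounds are easy to test on a computer. The sketch below, written in Python
with exact fractions, lists the opening terms of the sequence and checks the
bounds over the first thousand terms; the function name `a` and the choice of a
thousand terms are just for illustration, and a finite check is of course no
substitute for the argument above.

```python
from fractions import Fraction


def a(n):
    """nth term of the sequence a_n = 1/n + (-1)**n, as an exact fraction."""
    return Fraction(1, n) + (-1) ** n


terms = [a(n) for n in range(1, 1001)]
print([str(t) for t in terms[:7]])   # the opening terms listed in the text
print(min(terms), max(terms))        # smallest and largest of the first 1000 terms
```

The largest term is the second one, while the smallest of the sampled terms creeps
towards $-1$ without ever reaching it.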

More generally we say that an arbitrary sequence $(a_n)$ is bounded above if there
exists a real number $L$ such that $a_n \le L$ for all $n$ and $L$ is then called an upper
bound for the sequence. A sequence is bounded below if there exists a real number
$M$ such that $a_n \ge M$ for all $n$ and $M$ is called a lower bound.

Three points to note. 

1. Upper and lower bounds are not unique. For example if we return to the
sequence with $a_n = \frac{1}{n} + (-1)^n$ then e.g. 5 is also an upper bound and $-21.6$
is a lower bound.

2. A sequence may be bounded above but fail to be bounded below and vice
versa. For example the sequence of natural numbers with $a_n = n$ is bounded
below but not bounded above. Similarly the sequence $a_n = -n$ is bounded
above but not bounded below.

3. A sequence is bounded if and only if it is both bounded above and bounded
below (see Exercise 5.1).

As upper and lower bounds are not unique we might enquire which of these is 
the ‘best’ for a given sequence. By ‘best’ here we mean the smallest (mathematicians 
prefer to say ‘least’) upper bound for a sequence that is bounded above and the 
greatest lower bound for one that is bounded below. If the sequence only took 
a finite number of distinct values then we would be looking for the maximum 
and minimum (respectively) as described in Section 3.4. When a sequence takes 
an infinite number of different values there is no reason why the maximum and 
minimum should exist. For example consider the sequence with general term
$a_n = \frac{1}{n}$. Later on we’ll prove that its greatest lower bound is 0 but we’ve already
pointed out that there is no number $N$ for which $\frac{1}{N} = 0$, so 0 cannot be the
minimum.

It’s a deep and subtle property of the real numbers that any sequence that is 
bounded above has a least upper bound and (consequently, as will follow from 
Theorem 5.1.2 - see Exercise 5.5) any sequence that is bounded below has a 
greatest lower bound. For now we’ll just assume this but we will come back to 
look at this result in greater detail in Chapter 11. 
Now some notation. In old books on analysis the greatest lower bound and
least upper bound often used to be denoted g.l.b. and l.u.b. (respectively), but for
a long time now mathematicians have preferred a different terminology which
has its root in Latin. So the greatest lower bound is referred to as the infimum and
shortened to inf while the least upper bound is called the supremum and denoted
by sup. These come from the same Latin root as ‘inferior’ and ‘superior’.¹

Now it’s time for a formal definition. Suppose that $(a_n)$ is a sequence that is
bounded above. We define its supremum $\sup(a_n)$ to be the unique real number
that satisfies

$$a_n \le \sup(a_n) \le L,$$

for all $n$ where $L$ is any upper bound for $(a_n)$. Similarly if $(a_n)$ is a sequence that
is bounded below, its infimum, $\inf(a_n)$, is defined to be the unique real number
that satisfies

$$K \le \inf(a_n) \le a_n,$$

for all $n$ where $K$ is any lower bound for $(a_n)$.

If there exists a natural number $N$ such that $\sup(a_n) = a_N$ we say that the
supremum is attained. A similar definition holds for the infimum.

As an example, we see that $\sup\left(\frac{1}{n}\right) = 1$ as $\frac{1}{n} \le 1$ for all $n \ge 1$. As $a_1 = 1$ we
see that the supremum is attained in this case. On the other hand 0 is a lower
bound and suppose that $\alpha$ is a larger lower bound. Then $0 < \alpha \le \frac{1}{n}$ for all $n$ and
so $n \le \frac{1}{\alpha}$ for all $n$ which is impossible, so we have a contradiction and can assert
that $\inf\left(\frac{1}{n}\right) = 0$. So in this case the infimum is not attained (but it does coincide
with the limit of the sequence).

We can see from this definition that sup coincides with max and inf is precisely
min if $(a_n)$ has only finitely many values. But on the other hand, these are more
subtle concepts that are not so far removed from limits as the following theorem
shows.

Theorem 5.1.1.

1. If the sequence $(a_n)$ is bounded above then given any $\epsilon > 0$ there exists a
natural number $n$ such that

$$a_n > \sup(a_n) - \epsilon. \qquad (5.1.1)$$

2. If the sequence $(a_n)$ is bounded below then given any $\epsilon > 0$ there exists a
natural number $n$ such that

$$a_n < \inf(a_n) + \epsilon. \qquad (5.1.2)$$


1 Try typing supremum and infimum into
http://ablemedia.com/ctcweb/showcase/wordsonline.html



Proof. We’ll only prove (1) as the proof for (2) is similar and we’ll employ a
proof by contradiction. So suppose the result is false and no such $n$ exists. Then
$a_n \le \sup(a_n) - \epsilon$ for all $n$. But then $\sup(a_n) - \epsilon$ is an upper bound for $(a_n)$ that is
smaller than $\sup(a_n)$. But $\sup(a_n)$ is the smallest upper bound (by definition) and
we have our contradiction. □

Although I said that Theorem 5.1.1 has some similarities with the definition of
the limit, I hope you’ll see that it presents much weaker statements. For example,
the first of these tells us that as we go along the sequence we must be able to find a
sufficiently large $n$ such that $a_n$ gets arbitrarily close to $\sup(a_n)$, but the direction of
closeness is only from one side and we say nothing about what happens for larger
values of $n$. Later in this chapter we will see that sups and infs can sometimes
be limits but we will need to impose more structure on the type of sequence we
consider.
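To make the second part of Theorem 5.1.1 concrete, here is a minimal Python sketch
for the sequence $a_n = \frac{1}{n}$, whose infimum is 0 (as claimed in the example earlier
in this section). The function name `witness_for_inf` is invented for this
illustration, and the search only makes sense for $\epsilon > 0$.

```python
from fractions import Fraction


def witness_for_inf(eps):
    """Smallest n with 1/n < eps, i.e. with a_n < inf(a_n) + eps for a_n = 1/n."""
    n = 1
    while Fraction(1, n) >= eps:  # keep going until the term drops below eps
        n += 1
    return n


for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(3, 100)):
    n = witness_for_inf(eps)
    print(eps, n, Fraction(1, n))
```

However small you make $\epsilon$, the search stops at some term of the sequence, even
though no term ever equals the infimum itself.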

There is a natural symmetry between the concepts of sup and inf - indeed to 
get from one to the other we just replace ‘greatest’ by ‘least’ and ‘upper’ by ‘lower’. 
In fact, there is a sense in which we only need one of these concepts - inf is really 
a sup in disguise as the following theorem proves. 

Theorem 5.1.2. If $(a_n)$ is a bounded sequence then

$$\inf(a_n) = -\sup(-a_n).$$


Proof. As $-a_n \le \sup(-a_n)$ for all $n$, by (L4) $-\sup(-a_n) \le a_n$ for all $n$ and so
$-\sup(-a_n)$ is a lower bound for our sequence. To show that it is really the inf
we’ll assume that it isn’t and try to obtain a contradiction. So assume that there
exists a real number $\beta$ such that

$$-\sup(-a_n) < \beta \le a_n,$$

for all $n$. But using (L4) again we get

$$-a_n \le -\beta < \sup(-a_n),$$

and so $-\beta$ is a smaller upper bound for the sequence $(-a_n)$ than its own
supremum - which is the contradiction we were looking for. □

This is fairly typical of how results about sups and infs are proved. If you’ve
guessed that a number $\alpha$ might be $\sup(a_n)$ you should firstly show that $\alpha$ really
is an upper bound for the sequence and secondly assume that it isn’t the sup and
try to find a contradiction. The next result we’ll prove is a fairly simple one but it
is quite useful.

Lemma 5.1.

1. If the sequence $(a_n)$ is bounded and is nonnegative, i.e. each $a_n \ge 0$ then

$$\sup(a_n) \ge \inf(a_n) \ge 0.$$

2. If $(b_n)$ and $(c_n)$ are bounded sequences with $b_n \ge c_n$ for all $n$ then
$\inf(b_n) \ge \inf(c_n)$ and $\sup(b_n) \ge \sup(c_n)$.


Proof.

1. Clearly 0 is a lower bound for $(a_n)$ so we must have $\inf(a_n) \ge 0$. But
$\sup(a_n) \ge \inf(a_n)$ by definition and that’s all we need.

2. We have $\inf(c_n) \le c_n \le b_n$ for all $n$ and so $\inf(c_n)$ is a lower bound for $(b_n)$. In
this case it cannot exceed the greatest lower bound and so $\inf(c_n) \le \inf(b_n)$.
The result for the sup is proved similarly. □

Suppose that the sequences $(a_n)$ and $(b_n)$ are both bounded above. Then so is the
sequence $(a_n + b_n)$. Indeed if $K$ is an upper bound for $(a_n)$ and $L$ is an upper bound
for $(b_n)$ you can check easily that $K + L$ is an upper bound for $(a_n + b_n)$. You may
then guess that we might have ‘$\sup(a_n + b_n) = \sup(a_n) + \sup(b_n)$’. As is so often
the case in this subject, we have to be more careful. Equality doesn’t hold in general,
as we can see by taking $a_n = 1 + \frac{1}{n}$ and $b_n = -\frac{1}{n}$. Then $\sup(a_n + b_n) = 1$, $\sup(a_n) = 2$
and $\sup(b_n) = 0$. What can be proved is a weaker but useful result.
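The counter-example can be explored with a few lines of Python. This is only a
sketch over the first $N$ terms (with $N = 1000$ chosen arbitrarily), so the largest
sampled value stands in for the supremum; that is exact for $(a_n)$ and $(a_n + b_n)$,
whose suprema are attained, but only approximate for $(b_n)$.

```python
from fractions import Fraction

N = 1000  # how many terms to inspect; a finite sample, not the whole sequence
a = [1 + Fraction(1, n) for n in range(1, N + 1)]
b = [-Fraction(1, n) for n in range(1, N + 1)]
sums = [x + y for x, y in zip(a, b)]

print(max(a))     # largest a_n in the sample, attained at n = 1
print(max(b))     # largest b_n in the sample: -1/N, creeping up towards 0
print(max(sums))  # every a_n + b_n is the same number
```

The largest value of the sum falls short of the sum of the two separate largest
values, which is the strict inequality described above.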

Theorem 5.1.3. If $(a_n)$ and $(b_n)$ are bounded above then

$$\sup(a_n + b_n) \le \sup(a_n) + \sup(b_n).$$

Proof. Since $a_n \le \sup(a_n)$ and $b_n \le \sup(b_n)$ we have

$$a_n + b_n \le \sup(a_n) + \sup(b_n),$$

for all $n$. So $\sup(a_n) + \sup(b_n)$ is an upper bound for $(a_n + b_n)$. Then it cannot be
smaller than the least upper bound and that gives our result. □

You might expect that a similar result to Theorem 5.1.3 holds for the inf. It
does, but you have to be careful as it goes the other way around. The result is that
if $(a_n)$ and $(b_n)$ are both bounded below then so is $(a_n + b_n)$ and

$$\inf(a_n) + \inf(b_n) \le \inf(a_n + b_n).$$

You can prove this for yourself either by imitating the proof of Theorem 5.1.3 or
by combining the result of that theorem with that of Theorem 5.1.2 to turn infs
into sups (see Exercise 5.4).


5.2 Monotone Sequences 


In this section we’ll focus on the question I posed before - when can a sup or an 
inf be a genuine limit? The answer to this is when a sequence is a monotone one.
To be precise we will say that a sequence $(a_n)$ is monotonic increasing if $a_{n+1} \geq a_n$ for all $n$ and is monotonic decreasing if $a_{n+1} \leq a_n$ for all $n$. Finally we say that a sequence is monotone if it is either monotonic increasing or decreasing. An example of a monotonic decreasing sequence is $a_n = \frac{1}{n}$ while $a_n = 1 - \frac{1}{n}$ is monotonic increasing. To prove the first is straightforward. To prove the second you can either show that we always have $a_{n+1} - a_n \geq 0$ by doing some algebra, or use the (fairly obvious) fact that a sequence $(a_n)$ is monotonic increasing if and only if $(-a_n)$ is monotonic decreasing.

Now it certainly isn’t true that every monotone sequence converges, e.g. think of $a_n = n$. But suppose a sequence is monotonic increasing and bounded above. Then on the one hand we are told that our sequence is steadily increasing in value, but on the other hand, we have imposed a ceiling on it that it cannot exceed. So where can it go to except to the ceiling? The next result puts this intuition into precise mathematical form.

Theorem 5.2.1.

1. If the sequence $(a_n)$ is bounded above and monotonic increasing then it is convergent and $\lim_{n \to \infty} a_n = \sup(a_n)$.

2. If the sequence $(a_n)$ is bounded below and monotonic decreasing then it is convergent and $\lim_{n \to \infty} a_n = \inf(a_n)$.

Proof. 

1. Since $(a_n)$ is bounded above we know that $a = \sup(a_n)$ exists. We also know from Theorem 5.1.1 that given any $\epsilon > 0$ there exists $n_0$ such that $a_{n_0} > a - \epsilon$. But $(a_n)$ is monotonic increasing so for all $n > n_0$ we have $a_n \geq a_{n_0+1} \geq a_{n_0} > a - \epsilon$. This tells us by simple algebra that

$$a - a_n < \epsilon, \quad \text{for all } n > n_0.$$

But $a \geq a_n$ for all $n$ and so $|a - a_n| = a - a_n$. Then we’ve satisfied the conditions for convergence as described in the definition of the concept and can assert that $\lim_{n \to \infty} a_n = a$.

2. This is a good opportunity to test your understanding by doing it yourself. There are two approaches. The first is to imitate the proof we’ve just given by using the second part of Theorem 5.1.1. The second approach which is perhaps a little slicker is to derive the result as a corollary to (1) by using the fact that a sequence $(a_n)$ is monotonic decreasing and bounded below if and only if the sequence $(-a_n)$ is monotonic increasing and bounded above and then applying (1) and Theorem 5.1.2. □

You might think that the next result is too obvious to need a proof - but I hope 
you’ve seen enough by now to appreciate that the obvious isn’t always true. In 
mathematics, everything must be proved logically.


Corollary 5.2.1. 

1. If the sequence $(a_n)$ is monotonic increasing then either it converges or it diverges to $+\infty$.

2. If the sequence $(a_n)$ is monotonic decreasing then either it converges or it diverges to $-\infty$.

Proof. We’ll only prove (2) as (1) is so similar. Suppose that $(a_n)$ is monotonic decreasing. Then either it is bounded below or it isn’t. If it is bounded below then it converges by Theorem 5.2.1 (2). If it isn’t bounded below then given any $K < 0$ we can find a natural number $n_0$ such that $a_{n_0} < K$ for otherwise $K$ would be a lower bound. But then since the sequence is monotonic decreasing we have $a_n < K$ for all $n > n_0$ and so the sequence diverges to $-\infty$. □


5.3 An Old Friend Returns 


To get a feel for how to use Theorem 5.2.1 we should do an example.² We’ll construct a sequence $(a_n)$ by recursion so that $a_{n+1}$ is not given explicitly by a known formula but implicitly through the value of $a_n$. This doesn’t work unless we have a starting point and so we define $(a_n)$ by:

$$a_1 = 1 \quad \text{and} \quad a_{n+1} = \sqrt{1 + a_n} \quad \text{for } n \geq 1. \qquad (5.3.3)$$

Let’s calculate the first few terms. We have

$$a_2 = \sqrt{1 + 1} = 1.4142136\ldots$$
$$a_3 = \sqrt{1 + \sqrt{2}} = 1.553774\ldots$$
$$a_4 = \sqrt{1 + \sqrt{1 + \sqrt{2}}} = 1.5980532\ldots$$
$$a_5 = 1.6118478\ldots$$
$$a_6 = 1.6161212\ldots$$

It certainly looks like $(a_n)$ is increasing and bounded above. How do we prove this? Let’s look at the bounded problem first.

Bounded. From the calculations we’ve done it certainly looks like 2 will be an upper bound. There’s no good reason why it should be the sup but finding that isn’t our concern . . . yet. Let’s use a proof by contradiction and suppose that there exists a number $N$ such that $a_n < 2$ for all $1 \leq n \leq N$, but $a_{N+1} \geq 2$. From our calculations we know that if it exists then $N > 5$. Now by (5.3.3), $\sqrt{1 + a_N} \geq 2$ and squaring this (remember (L6)) yields

$$1 + a_N \geq 4,$$

i.e. $a_N \geq 3$. That’s a contradiction and so we can assert that our sequence really is bounded above.

2 You may find it helpful to attempt Exercise 5.6 before you read the rest of this section.

Monotone. Squaring the general recursive formula in (5.3.3) we get for all $n \geq 1$,

$$a_{n+1}^2 = 1 + a_n,$$

and for all $n \geq 2$,

$$a_n^2 = 1 + a_{n-1}.$$

Subtracting the second equation from the first yields

$$a_{n+1}^2 - a_n^2 = a_n - a_{n-1},$$

i.e.

$$(a_{n+1} + a_n)(a_{n+1} - a_n) = a_n - a_{n-1},$$

and so,³

$$a_{n+1} - a_n = \frac{a_n - a_{n-1}}{a_{n+1} + a_n}.$$

Working backwards we get

$$a_n - a_{n-1} = \frac{a_{n-1} - a_{n-2}}{a_n + a_{n-1}},$$

and continuing in this manner we eventually get to

$$a_3 - a_2 = \frac{a_2 - a_1}{a_3 + a_2}.$$

Combining all of these together we find that

$$a_{n+1} - a_n = \frac{a_2 - a_1}{(a_{n+1} + a_n)(a_n + a_{n-1}) \cdots (a_3 + a_2)} = \frac{\sqrt{2} - 1}{(a_{n+1} + a_n)(a_n + a_{n-1}) \cdots (a_3 + a_2)} > 0,$$

as $\sqrt{2} > 1$ and the bottom line of the fraction is a positive number. This shows that $a_{n+1} > a_n$ for all $n$ and so $(a_n)$ is monotonic increasing as we wanted.


3 You can use a similar argument to the one we have given to prove that the sequence is bounded above to show $a_n \geq 1$ for all $n$ and so the division below is justified.


Limit. As the sequence is bounded above and monotonic increasing we know that it converges by Theorem 5.2.1. Let $l = \sup(a_n) = \lim_{n \to \infty} a_n$. To find $l$ we’ll first square both sides of (5.3.3) to get

$$a_{n+1}^2 = 1 + a_n,$$

and then take limits of both sides

$$\lim_{n \to \infty} a_{n+1}^2 = \lim_{n \to \infty} (1 + a_n).$$

Now apply the algebra of limits (Theorem 4.3.1) and we obtain a quadratic equation in $l$:

$$l^2 = 1 + l,$$

i.e. $l^2 - l - 1 = 0$.

We’ve met this equation before in Section 4.4 where we learned that it has two solutions - the golden section $\phi$ and $1 - \phi$. In our case, since the limit is the sup and every term of the sequence is a positive number, we must have $l > 0$ and so $l = \phi$. So we find the golden section appearing in another guise - as the limit of the sequence defined by (5.3.3).
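As a numerical sanity check on the three steps above (bounded, monotone, limit), here is a minimal Python sketch that iterates the recursion (5.3.3) in floating point. It illustrates the result but proves nothing. The function name `nested_sqrt_terms` and the choice of 20 terms are illustrative assumptions, not part of the text.

```python
import math

def nested_sqrt_terms(count):
    """Return the first `count` terms of a_1 = 1, a_{n+1} = sqrt(1 + a_n)."""
    terms = [1.0]
    while len(terms) < count:
        terms.append(math.sqrt(1.0 + terms[-1]))
    return terms

terms = nested_sqrt_terms(20)
phi = (1 + math.sqrt(5)) / 2  # golden section: the positive root of l^2 - l - 1 = 0

# Check, for these 20 terms only, the properties proved in the text.
is_increasing = all(x < y for x, y in zip(terms, terms[1:]))
is_bounded_by_2 = all(x < 2 for x in terms)
gap_to_phi = abs(terms[-1] - phi)

print(terms[1:6])          # a_2, ..., a_6 as tabulated in the text
print(is_increasing, is_bounded_by_2, gap_to_phi)
```

Only 20 terms are used because consecutive terms soon differ by less than floating-point precision, at which point a strict comparison would no longer be meaningful.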


5.4 Finding Square Roots 


We’ve already pointed out that one of our goals is to be able to find (at least in principle) any irrational number as the limit of a sequence of rationals. In the last section we saw how to do this for the golden section. We are pretty far from being able to do this in general, but we can at least look at the square roots of prime numbers which we showed were irrational back in Chapter 2. Our aim is to find $\sqrt{p}$ where $p$ is any prime number and to do this we again set up a recursive sequence which this time is defined by

$$a_1^2 > p \quad \text{and} \quad a_{n+1} = \frac{1}{2}\left(a_n + \frac{p}{a_n}\right) \quad \text{for } n \geq 1, \qquad (5.4.4)$$


where I also insist that $a_1$ is a rational number. It seems strange to define $a_1$ by an inequality but here all I am saying is that I have perfect freedom to choose $a_1$ however I please, provided that its square exceeds the number $p$ whose square root we seek. So for example if $p = 5$ we could take $a_1 = 2.5$ or $a_1 = 3$ but not $a_1 = 2$. If this bothers you, then there is no harm done in just choosing $a_1 = p$. I will return to this discussion of ‘starting points’ at the very end of the section. Note that each $a_n$ is rational. For suppose that $a_1, a_2, \ldots, a_{N-1}$ are rational but $a_N$ isn’t. We know $N > 1$ as $a_1$ is rational. Now as $a_{N-1}$ is rational we can write $a_{N-1} = \frac{x}{y}$ where $x$ and $y$ are natural numbers. Then

$$a_N = \frac{1}{2}\left(\frac{x}{y} + \frac{yp}{x}\right)$$

is also a rational number and that’s the contradiction we were looking for. In the argument that I just gave, I assumed that $a_n$ was always a positive number. This really should have been proved first but I’ll leave that step to the reader. It isn’t hard and works by a similar argument to the one I’ve just given.

Before we go any further, let’s look at a special case. We’ll take $p = 5$. Then systematically applying (5.4.4) (and choosing $a_1 = 5$) we find $a_2 = 3$, $a_3 = 2.33333\ldots$, $a_4 = 2.2380952\ldots$, $a_5 = 2.2360689\ldots$, $a_6 = 2.236068$. I’ll stop here as after five uses of (5.4.4), or iterations, we have found $\sqrt{5}$ correct to six decimal places. This is pretty impressive.
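The special case above can be reproduced exactly with rational arithmetic. The sketch below assumes Python’s standard `fractions` module. The helper name `sqrt_iterates` is hypothetical. It applies (5.4.4) with $p = 5$ and $a_1 = 5$, so every term is an exact rational, as the text claims.

```python
from fractions import Fraction

def sqrt_iterates(p, start, count):
    """First `count` terms of a_1 = start, a_{n+1} = (a_n + p / a_n) / 2, kept as exact rationals."""
    terms = [Fraction(start)]
    while len(terms) < count:
        a = terms[-1]
        terms.append((a + Fraction(p) / a) / 2)
    return terms

terms = sqrt_iterates(5, 5, 6)
for n, a in enumerate(terms, start=1):
    print(n, a, float(a))

# Properties argued for in the text, checked on these six terms only.
squares_exceed_p = all(a * a > 5 for a in terms)
is_decreasing = all(x > y for x, y in zip(terms, terms[1:]))
print(squares_exceed_p, is_decreasing)
```

Working with `Fraction` rather than floats keeps each term a genuine rational number, at the cost of numerators and denominators that grow quickly.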

Returning to the general case, we’d like to show that $\lim_{n \to \infty} a_n = \sqrt{p}$. The obvious strategy is to use Theorem 5.2.1 by showing that the sequence $(a_n)$ is bounded below and monotonic decreasing.

Now notice that if we know that $(a_n)$ converges then its limit really is $\sqrt{p}$. To see that write $a = \lim_{n \to \infty} a_n$ and use algebra of limits in (5.4.4) to deduce that

$$a = \frac{1}{2}\left(a + \frac{p}{a}\right),$$

and a little algebra yields $a^2 = p$ and so $a = \sqrt{p}$ (we can’t have $a = -\sqrt{p}$ by Theorem 4.1.2).

Now in Exercise 3.10 you should have verified the inequality

$$p \leq \frac{1}{4}\left(y + \frac{p}{y}\right)^2 \leq y^2 \qquad (5.4.5)$$

whenever $y^2 \geq p$. Since $a_1^2 > p$ (and now you know why I insisted on this) if we put $y = a_1$ then we see from (5.4.5) that we also have $a_2^2 \geq p$. Then use (5.4.5) again with $y = a_2$ to get $a_3^2 \geq p$. Now suppose that $a_k^2 \geq p$ for all $k = 1, 2, \ldots, n$ but that $a_{n+1}^2 < p$. Then (5.4.5) yields $p \leq \frac{1}{4}\left(a_n + \frac{p}{a_n}\right)^2 = a_{n+1}^2$ and we have deduced a contradiction. Hence we see that we must have $a_n^2 \geq p$ for all $n$ and so the sequence $(a_n)$ is bounded below and $\sqrt{p}$ is a lower bound.

We’ll now prove that $(a_n)$ is monotonic decreasing. We compute

$$a_n - a_{n+1} = \frac{a_n}{2} - \frac{p}{2a_n} = \frac{1}{2a_n}(a_n^2 - p) \geq 0,$$

as we have just shown that $a_n^2 \geq p$ for all $n$.


The method we’ve just used for finding square roots of primes can just as easily be used to find the square root of any positive real number. There is another quite straightforward way to obtain square roots without using Theorem 5.2.1 and this is called the ‘Newton-Raphson method’.⁴ It relies on calculus and so is outside the scope of this book. In fact when we use this, we can forget about trying to make the sequence monotonic decreasing from the outset which I did here by making the choice $a_1^2 > p$. You get quicker convergence if you start at a point which is a rough guess at $\sqrt{p}$ so for the case $p = 5$, since $2^2 = 4$ and $3^2 = 9$, you might choose $a_1 = 2.2$ or $2.3$.
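To see the effect of the starting point, the following floating-point sketch counts how many applications of (5.4.4) are needed before the iterate is within a tolerance of $\sqrt{5}$. The helper `iterations_needed` and the tolerance `1e-12` are assumptions made for illustration. The built-in `p ** 0.5` is used only as a reference value to measure the error against, not as part of the method.

```python
def iterations_needed(p, start, tol=1e-12, max_steps=100):
    """Count applications of a -> (a + p / a) / 2 until |a - sqrt(p)| <= tol."""
    target = p ** 0.5  # reference value, used only to measure the error
    a = float(start)
    steps = 0
    while abs(a - target) > tol and steps < max_steps:
        a = (a + p / a) / 2
        steps += 1
    return steps

from_p = iterations_needed(5, 5.0)      # the "safe" choice a_1 = p
from_guess = iterations_needed(5, 2.2)  # a rough guess at sqrt(5)
print(from_p, from_guess)
```

Note that $2.2^2 = 4.84 < 5$, so this starting point does not satisfy $a_1^2 > p$. The first step can then increase the value, which is the loss of monotonicity "from the outset" that the paragraph above mentions.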


5.5 Exercises for Chapter 5 


1. Prove that a sequence $(a_n)$ is bounded (i.e. there exists $K > 0$ such that $|a_n| \leq K$ for all $n$) if and only if it is bounded above and bounded below.

2. For each of the following sequences say whether it is (i) bounded above or 
below, (ii) monotone increasing or decreasing, (iii) convergent. In the case of 
(i), write down an upper or lower bound (as appropriate) and try to guess the 
supremum and/or infimum (or even better, establish these by a proof) and in 
(iii), write down the limit when this exists. 

(a) 3- | (b) — ] -j- (c) " d) ( _1 )"“ (e)H1-(-1) n ) 

' n + 1 1 n+1 

3. If $(a_n)$ and $(b_n)$ are sequences which are bounded above, show that

(a) $\sup(a_n b_n) \leq \sup(a_n)\sup(b_n)$, whenever each $a_n, b_n \geq 0$.

Give a counter-example to show that equality does not hold in general. Show further that

(b) $\sup(|a_n + b_n|) \leq \sup(|a_n|) + \sup(|b_n|)$,

(c) $|\sup(a_n)| \leq \sup(|a_n|)$.

4. Suppose that the sequences $(a_n)$ and $(b_n)$ are bounded below. Deduce that $\inf(a_n) + \inf(b_n) \leq \inf(a_n + b_n)$. Also find and prove an analogue to part (a) of the previous question. Give counter-examples to show that equality doesn't always hold in both cases.

4 See e.g. http://en.wikipedia.org/wiki/Newton's_method. If you know what the method is then to find square roots of primes you need to apply it to $f(x) = x^2 - p$.

5. Assume the completeness property of the real numbers (see Chapter 11), i.e. that every sequence $(x_n)$ that is bounded above has a least upper bound. Prove that every sequence that is bounded below has a greatest lower bound. [Hint: Use Theorem 5.1.2.]

6. Let $x_1 = 2.5$ and $x_{n+1} = \frac{1}{5}(x_n^2 + 6)$ for $n \geq 1$.

(a) Show that each $2 < x_n < 3$. (Hint: Try a proof by contradiction.)

(b) Show that $x_{n+1} - x_n = \frac{1}{5}(x_n - 2)(x_n - 3)$.

(c) Show that the sequence $(x_n)$ is monotone and find its limit as $n \to \infty$.

7. Let $a > b > 0$. We define sequences $(a_n)$ and $(b_n)$ by taking $a_1$ and $b_1$ to be $a$ and $b$ respectively, and requiring that for $n \geq 1$, $a_{n+1} = \frac{1}{2}(a_n + b_n)$ and $b_{n+1} = \sqrt{a_n b_n}$. In other words, $a_{n+1}$ is the arithmetic mean of $a_n$ and $b_n$ while $b_{n+1}$ is their geometric mean.

(a) Prove that $b_n < b_{n+1} < a_{n+1} < a_n$ for each $n$.

(b) Prove that $a_{n+1} - b_{n+1} < \frac{1}{2}(a_n - b_n)$ for all $n$.

(c) Deduce that the sequences $(a_n)$ and $(b_n)$ are each convergent and that they converge to the same limit. (The common limit $M(a; b) = \lim_{n \to \infty} a_n = \lim_{n \to \infty} b_n$ is called the arithmetic-geometric mean of $a$ and $b$. It can be given a precise form using objects called elliptic integrals.)

8. Show that if $(a_n)$ is a sequence that is both monotonic increasing and also convergent to a limit $l$ as $n \to \infty$, then $(a_n)$ is bounded above and $l = \sup(a_n)$. What happens when $(a_n)$ is monotonic decreasing and convergent?

9. The purpose of this question is to prove that $n^p x^n \to 0$ as $n \to \infty$ for any real number $p$ and for any $-1 < x < 1$. Assume firstly that $0 < x < 1$, and write $a_n = n^p x^n$. (a) Show that $\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = x$. (b) Deduce that $\frac{a_{n+1}}{a_n}$ is eventually less than one and so $(a_n)$ is eventually decreasing. [Here 'eventually' means there is some $N$ such that the statement is true for all $n > N$.] (c) Deduce that the sequence $(a_n)$ tends to a nonnegative limit $l$. (d) Use part (a) with Exercise 4.13 to deduce that $l = 0$. What about the case where $-1 < x < 0$?

10. Suppose that $(a_n)$ is a monotonic increasing sequence that has a subsequence $(a_{n_k})$ which converges to a limit $l$.

(a) Show that $a_n \leq l$ for all $n$.

(b) Show that $(a_n)$ converges to $l$.

(c) What happens when 'increasing' is replaced by 'decreasing' in this question?

11. As promised at the end of Exercise 4.16 we will now prove the celebrated Bolzano-Weierstrass theorem which states that every bounded sequence has at least one convergent subsequence. Let $(x_n)$ be a sequence for which $a \leq x_n \leq b$ for all $n$. Define $a_1 = a$ and $b_1 = b$. Let $c_1 = \frac{1}{2}(a_1 + b_1)$ be the midpoint of $[a_1, b_1]$. Either an infinite number of terms of the sequence $(x_n)$ lie in the interval $[a_1, c_1]$ or they lie in $[c_1, b_1]$ (or they lie in both). Suppose for the sake of argument that they lie in $[a_1, c_1]$. Define $a_2 = a_1$ and $b_2 = c_1$. Define $c_2 = \frac{1}{2}(a_2 + b_2)$ and repeat the argument just given. In fact repeat this exercise indefinitely to generate two sequences of numbers $(a_n)$ and $(b_n)$ where

$$a_1 \leq a_2 \leq \cdots \leq a_n \leq \cdots \leq b_n \leq \cdots \leq b_2 \leq b_1.$$

(a) Deduce that $b_n - a_n = \frac{1}{2^{n-1}}(b - a)$ for all $n$.

(b) Use Theorem 5.2.1 to show that $(a_n)$ and $(b_n)$ both converge. Use the result of (a) to verify that $\lim_{n \to \infty} a_n = \lim_{n \to \infty} b_n$.

(c) Explain how you may extract a subsequence $(x_{n_r})$ of $(x_n)$ for which $a_r \leq x_{n_r} \leq b_r$ for each $r$. Hence show that $(x_{n_r})$ converges.

12. Let $(x_n)$ be a bounded sequence and define two associated sequences as follows

$$a_n = \sup(x_m, m \geq n) \quad \text{and} \quad b_n = \inf(x_m, m \geq n).$$

(a) Show that $(a_n)$ is monotonic decreasing, bounded below and hence convergent.

(b) Show that $(b_n)$ is monotonic increasing and bounded above and hence convergent.

We define

$$\limsup_{n \to \infty} x_n = \lim_{n \to \infty} a_n,$$

$$\liminf_{n \to \infty} x_n = \lim_{n \to \infty} b_n.$$

Find lim sup and lim inf of the following sequences: 

<i) (-i) n , (ii) - n (iii) (-mi 

Note: lim sup and lim inf play a major role in advanced analysis. An important theorem states that a bounded sequence $(x_n)$ converges to the limit $l$ if and only if

$$\limsup_{n \to \infty} x_n = \liminf_{n \to \infty} x_n = l.$$

You may encounter some books in which $\limsup_{n \to \infty} x_n$ is written $\overline{\lim}_{n \to \infty} x_n$, and $\liminf_{n \to \infty} x_n$ is written $\underline{\lim}_{n \to \infty} x_n$.

You Cannot be


The limiting process was victorious. For the limit is an indispensable concept, 
whose importance is not affected by the acceptance or rejection of the 
infinitely small. 

Philosophy of Mathematics and Natural Science, Hermann Weyl 


6.1 What are Series? 


Sequences are the fundamental objects in the study of limits. In this chapter we will meet a very special type of sequence whose limit (when it exists) is the best meaning we can give to the intuitive idea of an ‘infinite sum of numbers’. Let’s be specific. Suppose we are given a sequence $(a_n)$. Our primary focus in this chapter will not be on the sequence $(a_n)$ but on a related sequence which we will call $(s_n)$. It is defined as follows:

$$s_1 = a_1,$$
$$s_2 = a_1 + a_2,$$
$$s_3 = a_1 + a_2 + a_3,$$

and more generally

$$s_n = a_1 + a_2 + \cdots + a_{n-1} + a_n.$$

The sequence $(s_n)$ is called a series (or infinite series) and the term $s_n$ is often called the nth partial sum of the series. The goal of this chapter will be to investigate when the series $(s_n)$ has a limit. In this way we can try to give meaning to ‘infinite sums’ which we might write informally as

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots \qquad (6.1.1)$$

or

$$1 - 1 + 1 - 1 + 1 - 1 + \cdots, \qquad (6.1.2)$$

but beware that the meaning that we’ll eventually give to expressions like this (in those cases where it is indeed possible) will not be the literal one of an infinite number of additions (what can that mean?), but as the limit of a sequence. Sequences that arise in this way turn up throughout mathematics and its applications so understanding them is very important.
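The passage from a sequence $(a_n)$ to its partial sums $(s_n)$ is easy to mimic on a computer. The sketch below assumes Python’s standard `itertools.accumulate` and `fractions.Fraction`. The helper name `partial_sums` is hypothetical. It is applied to the first few terms of the two informal ‘infinite sums’ (6.1.1) and (6.1.2).

```python
from fractions import Fraction
from itertools import accumulate

def partial_sums(terms):
    """Return [s_1, s_2, ...] where s_n = a_1 + a_2 + ... + a_n."""
    return list(accumulate(terms))

harmonic_terms = [Fraction(1, i) for i in range(1, 6)]      # 1, 1/2, 1/3, 1/4, 1/5
alternating_terms = [(-1) ** (i + 1) for i in range(1, 7)]  # 1, -1, 1, -1, 1, -1

harmonic_partial = partial_sums(harmonic_terms)
alternating_partial = partial_sums(alternating_terms)

print(harmonic_partial)
print(alternating_partial)  # [1, 0, 1, 0, 1, 0]
```

The second list keeps flipping between 1 and 0, which already hints that not every ‘infinite sum’ can be given a meaning as a limit.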


6.2 The Sigma Notation 


Before we can start taking limits of series we need to develop some useful notation for finite sums that will simplify our approach to expressions like (6.1.1) and (6.1.2).¹

Let’s suppose that we are given the following ten whole numbers: $a_1 = 3$, $a_2 = -7$, $a_3 = 5$, $a_4 = 16$, $a_5 = -1$, $a_6 = 3$, $a_7 = 10$, $a_8 = 14$, $a_9 = -2$, $a_{10} = 0$. We can easily calculate their sum

$$a_1 + a_2 + \cdots + a_{10} = 41, \qquad (6.2.3)$$

but there is a more compact way of writing the left-hand side which is widely used by mathematicians and those who apply mathematics. It is called the ‘sigma notation’ because it utilises the Greek letter $\Sigma$ (pronounced ‘sigma’) which is capital S in English (and S should be thought of here as standing for ‘sum’). Using this notation we write the left-hand side of (6.2.3) as

$$\sum_{i=1}^{10} a_i.$$

Now if you haven’t seen it before, this may appear to be a complex piece of symbolism - but don’t despair. We’ll unpick it slowly and we’ll read bottom up. The $i = 1$ tells us that the first term in our addition is $a_1$, then the $\Sigma$ comes into play and tells us that we must add $a_2$ to $a_1$ and then $a_3$ to $a_1 + a_2$ and then $a_4$ to $a_1 + a_2 + a_3$ and then ... but when do we stop? Well go to the top of $\Sigma$ and read the number 10. That tells you that $a_{10}$ is the last number you should add, and that’s it. The notation is very flexible. For example you also have

$$\sum_{i=1}^{8} a_i = a_1 + a_2 + \cdots + a_8 = 43,$$

$$\sum_{i=2}^{9} a_i = a_2 + a_3 + \cdots + a_9 = 38,$$

$$\sum_{i=6}^{7} a_i = a_6 + a_7 = 13.$$

1 Readers who already know about this notation may want to omit this section.
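The worked sums above translate directly into code. In this minimal sketch the helper `sigma` is a hypothetical name. It takes 1-based, inclusive limits to match the notation $\sum_{i=\text{start}}^{\text{stop}} a_i$, whereas Python lists are indexed from 0.

```python
a = [3, -7, 5, 16, -1, 3, 10, 14, -2, 0]  # a_1, ..., a_10 from the text

def sigma(values, start, stop):
    """Sum a_start + ... + a_stop, using 1-based inclusive limits as in the sigma notation."""
    return sum(values[start - 1:stop])

print(sigma(a, 1, 10))  # 41
print(sigma(a, 1, 8))   # 43
print(sigma(a, 2, 9))   # 38
print(sigma(a, 6, 7))   # 13
```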

The following results are easily derived and quite useful. We will use them without further comment in the sequel. If $a_1, \ldots, a_n$, $b_1, \ldots, b_n$ and $c$ are arbitrary real numbers then

$$\sum_{i=1}^{n} (a_i + b_i) = \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} b_i,$$

$$\sum_{i=1}^{n} c a_i = c \sum_{i=1}^{n} a_i.$$

The sigma notation is particularly useful when we have a sequence $(a_n)$ and we want to add as far as the general term $a_n$ to obtain the nth partial sum $s_n$. We can now write

$$s_n = \sum_{i=1}^{n} a_i.$$

Notice that the letter $i$ here is only playing the role of a marker in telling us where to start and stop adding. It is called a ‘dummy index’ and the value of $s_n$ is unchanged if it is replaced by a different letter throughout - e.g.

$$\sum_{i=1}^{n} a_i = \sum_{j=1}^{n} a_j.$$

It’s worth pointing out one special case that often confuses students. Suppose that $a_i = k$ takes the same value for all $i$. Then

$$\sum_{i=1}^{n} a_i = \sum_{i=1}^{n} k = \underbrace{k + k + \cdots + k}_{n \text{ times}} = nk.$$

Our main concern in this chapter is with infinite rather than finite series, but before we return to the main topic let’s look at an interesting problem that (and this may be an apocryphal story) was given to one of the greatest mathematicians the world has ever seen, Carl Friedrich Gauss (1777-1855),² when he was a schoolboy. The story goes that the teacher wanted to concentrate on some urgent task and so he asked the whole class to calculate the sum of the first 100 natural numbers so that he could work in peace while they struggled with this fiendish problem. Apparently after 5 minutes Gauss produced the correct answer 5050. How did he get this? It is speculated that he noticed the following clever pattern by writing the numbers 1 to 50 in a row and then the numbers 51 to 100 in reverse order underneath:

  1   2   3   4  ···  49  50
100  99  98  97  ···  52  51

Now each column adds to 101 - but there are 50 columns and so the answer is $50 \times 101 = 5050$. In sigma notation, Gauss calculated $\sum_{i=1}^{100} i$. A natural generalisation is to seek a formula for $\sum_{i=1}^{n} i$, i.e. the sum of the first $n$ natural numbers. If $n = 2m$ is an even number you can use exactly the same technique as Gauss to show that the answer is $m(2m + 1)$ or $\frac{1}{2}n(n + 1)$. The same is true when $n = 2m + 1$ is odd since (using the fact that we know the answer when $n$ is even),

$$\sum_{i=1}^{2m+1} i = \sum_{i=1}^{2m} i + (2m + 1) = m(2m + 1) + (2m + 1) = (m + 1)(2m + 1) = \frac{1}{2}n(n + 1).$$

So we’ve shown that for every natural number $n$

$$\sum_{i=1}^{n} i = \frac{1}{2}n(n + 1). \qquad (6.2.4)$$

2 See e.g. http://202.38.126.65/navigate/math/history/Mathematicians/Gauss.html

At this stage it’s worth briefly considering more general finite sums which are obtained from summing all the numbers on the list $a, a + d, a + 2d, \ldots, a + (n - 1)d$. Note that there are $n$ numbers in this list which are obtained from the first term $a$ by repeatedly adding the common difference $d$. Such a list is called an arithmetic progression and we can find the sum $S$ of the first $n$ terms by using a slight variation of Gauss’ technique. So we write

$$S = a + (a + d) + (a + 2d) + \cdots + [a + (n - 1)d],$$

$$S = [a + (n - 1)d] + [a + (n - 2)d] + [a + (n - 3)d] + \cdots + a.$$

Now add these two expressions, noting that each of the $n$ columns on the right-hand side sums to $2a + (n - 1)d$, to get

$$2S = n(2a + (n - 1)d),$$

and so

$$S = n\left(a + \frac{1}{2}(n - 1)d\right).$$

If you take a = d = 1, you can check that you get another proof of (6.2.4). 
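The closed form for an arithmetic progression can be compared against a term-by-term sum. This is a minimal sketch restricted to integer $a$ and $d$ (an assumption made so that exact integer arithmetic can be used). Both function names are hypothetical.

```python
def arithmetic_sum(a, d, n):
    """Closed form S = n(2a + (n - 1)d) / 2 for integers a, d and n >= 1."""
    # 2an + n(n - 1)d is always even for integer a and d, so // is exact here.
    return n * (2 * a + (n - 1) * d) // 2

def arithmetic_sum_direct(a, d, n):
    """Add the n terms a, a + d, ..., a + (n - 1)d one at a time."""
    return sum(a + k * d for k in range(n))

print(arithmetic_sum(1, 1, 100))  # Gauss's sum: 5050
agrees = all(
    arithmetic_sum(a, d, n) == arithmetic_sum_direct(a, d, n)
    for a in range(-3, 4)
    for d in range(-3, 4)
    for n in range(1, 30)
)
print(agrees)
```

Taking `a = d = 1` recovers formula (6.2.4), which is the check suggested in the text.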
[Figure: panels labelled $a_1$, $a_2$, $a_3$]

Figure 6.1. Representation of some triangular numbers.

Before we return to our main topic, I can’t resist introducing the triangular numbers. This is the sequence $(a_n)$ defined by $a_1 = 1$, $a_2 = 1 + 2 = 3$, $a_3 = 1 + 2 + 3 = 6$, $a_4 = 1 + 2 + 3 + 4 = 10$, etc. so the nth term is $a_n = \frac{1}{2}n(n + 1)$. Figure 6.1 above should help you see why these numbers are called ‘triangular’.³

It is a fact that if you add successive triangular numbers you always get a perfect square so e.g.

$$1 + 3 = 4 = 2^2,$$
$$3 + 6 = 9 = 3^2,$$
$$6 + 10 = 16 = 4^2.$$

It is easy to prove that this holds in general by using (6.2.4). Indeed we have that the sum of the nth and (n + 1)th triangular numbers is

$$a_n + a_{n+1} = \frac{1}{2}n(n + 1) + \frac{1}{2}(n + 1)(n + 2) = \frac{1}{2}(n + 1)(n + n + 2) = (n + 1)^2.$$
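The identity just proved is easy to spot-check numerically. The sketch below is illustrative only. The function name `triangular` and the range of 1000 values are arbitrary choices.

```python
def triangular(n):
    """The nth triangular number 1 + 2 + ... + n = n(n + 1) / 2."""
    return n * (n + 1) // 2

first_four = [triangular(n) for n in range(1, 5)]
print(first_four)  # [1, 3, 6, 10]

# Successive triangular numbers add up to a perfect square: a_n + a_{n+1} = (n + 1)^2.
squares_ok = all(
    triangular(n) + triangular(n + 1) == (n + 1) ** 2 for n in range(1, 1001)
)
print(squares_ok)
```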


6.3 Convergence of Series 


Let’s return to the main topic of this chapter. We are given a sequence $(a_n)$ and we form the associated sequence of partial sums $(s_n)$ where $s_n = \sum_{i=1}^{n} a_i$. Now suppose that $(s_n)$ converges to some real number $l$ in the sense of Chapter 4, i.e. for any $\epsilon > 0$ there exists a natural number $n_0$ such that whenever $n > n_0$ we have $|s_n - l| < \epsilon$. In this case we call $l$ the sum of the series. In some ways this is a bad name as $l$ is not a sum in the usual sense of the word, it is a limit of sums, but this is standard terminology and we will have to live with it. There is also a special notation for $l$. We write it as $\sum_{i=1}^{\infty} a_i$. Again from one point of view, this is a bad notation as it makes it look like an infinite process of addition, but on the other hand it is pretty natural once you get used to it and it works well from the following perspective:

$$l = \sum_{i=1}^{\infty} a_i = \lim_{n \to \infty} \sum_{i=1}^{n} a_i = \lim_{n \to \infty} s_n.$$ ⁴

Just to be absolutely clear that we understand what $\sum_{i=1}^{\infty} a_i$ is, I’ll remind you that (if it exists) it is the real number that has the property that given any $\epsilon > 0$ there exists a natural number $n_0$ such that whenever $n > n_0$ we have $\left|\sum_{i=1}^{n} a_i - \sum_{i=1}^{\infty} a_i\right| < \epsilon$. When we consider this, we might conclude that ‘sum of a series’ is not a bad name as we can get arbitrarily close to the limit by adding a large enough number of terms - so adding more terms beyond the $N$ that takes you to within $\epsilon$ of the limit isn’t going to give you much more if $\epsilon$ is sufficiently small! Indeed if the sum of the series exists, it is common for even the most rigorous mathematicians to write

$$\sum_{i=1}^{\infty} a_i = a_1 + a_2 + a_3 + \cdots \qquad (6.3.5)$$

as though we really do have an infinite process of addition going on! There’s no harm in doing this as long as you appreciate that (6.3.5) is nothing more than a suggestive notation. The truth is in the $\epsilon$s and $N$s of the limit concept.

If $(s_n)$ diverges to $+\infty$ we sometimes use the notation $\sum_{i=1}^{\infty} a_i = \infty$. Similarly we write $\sum_{i=1}^{\infty} a_i = -\infty$ when $(s_n)$ diverges to $-\infty$. Also (and I hope this terminology won’t confuse anyone), if I write that $\sum_{i=1}^{n} a_i$ (or even $\sum_{i=1}^{\infty} a_i$) converges (or diverges) this is sometimes just a convenient shorthand for the convergence (or divergence) of $(s_n)$ where $s_n = \sum_{i=1}^{n} a_i$. So when we talk of convergent (or divergent) series, we really mean the convergence (or divergence) of the associated sequence of partial sums.

3 See also http://en.wikipedia.org/wiki/Triangular_number.

4 Just to confuse you, many textbooks write $\sum_{i=1}^{\infty} a_i$ as $\sum_{n=1}^{\infty} a_n$ which is perfectly OK as $i$ and $n$ are dummy indices, but which might be a little strange at first because of the role we have given to $n$. For this reason I'm sticking to $i$, at least in this chapter where the ideas are new to you!

Now what about some examples? To make things simpler at the beginning, for the next two sections we’ll focus on nonnegative series, i.e. those for which $a_i \geq 0$ for each $i$ - so for example (6.1.1) is included, but not (6.1.2).


6.4 Nonnegative Series 


OK - so far all we’ve really done is to give a definition and introduce some new notation. Now it’s time to get down to the serious business of real mathematics. I’ll remind you that for the next few sections we’re going to focus on sequences $(a_n)$ where each $a_n \geq 0$. We consider the partial sums $(s_n)$. Our first observation is that this is a monotonic increasing sequence as

$$s_{n+1} - s_n = \sum_{i=1}^{n+1} a_i - \sum_{i=1}^{n} a_i = \sum_{i=1}^{n} a_i + a_{n+1} - \sum_{i=1}^{n} a_i = a_{n+1} \geq 0,$$

and so $s_{n+1} \geq s_n$ for all $n$. It then follows from Corollary 5.2.1 that $(s_n)$ either converges (to its supremum) or diverges to $+\infty$. Now we’ll study some examples.

We’ve seen already that $\sum_{i=1}^{n} i = \frac{1}{2}n(n + 1)$ and this clearly diverges to $+\infty$. If you look at higher powers of $i$ then their partial sums are even larger and so we should expect divergence again, e.g. $\sum_{i=1}^{n} i^2 = 1 + 2^2 + 3^2 + \cdots + n^2 \geq n^2 \geq n$, and since $(n)$ diverges, then so does $\sum_{i=1}^{n} i^2$. Indeed this is a consequence of Theorem 4.1.3. A similar argument applies to $\sum_{i=1}^{n} i^m$ where $m > 1$ is any real number. What about $m = 0$? Well here we have $\sum_{i=1}^{n} i^0 = \sum_{i=1}^{n} 1 = n$ which again diverges. Finally if $0 < m < 1$, we have $i^m \geq 1$ and so $\sum_{i=1}^{n} i^m \geq \sum_{i=1}^{n} 1 = n$ which also diverges. So we can conclude that $\sum_{i=1}^{n} i^m$ diverges to $+\infty$ for all $m \geq 0$. What happens when $m < 0$? Let’s start by looking at the case $m = -1$. This is the famous harmonic series which is obtained by summing the terms of the harmonic sequence that we discussed in Section 4.1:⁵

$$\sum_{i=1}^{n} \frac{1}{i} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}.$$


5 For the relationship with the notion of harmonic in music see e.g. http://en.wikipedia.org/wiki/Harmonic_series_(music)

A reasonable conjecture might be that this series converges as $\lim_{n \to \infty} \frac{1}{n} = 0$ so as $n$ gets very large the difference between $s_n$ and $s_{n+1}$ is getting smaller and smaller. Indeed if we calculate the first few terms we find that $s_1 = 1$, $s_2 = \frac{3}{2}$, $s_3 = \frac{11}{6}$, $s_4 = \frac{25}{12}$, $s_5 = \frac{137}{60}, \ldots$, so we might well believe that the sum of the series is a number between 2 and 3. But (as we should know by now), looking at the first five terms (or even the first billion) may not be a helpful guide to understanding convergence. In fact $\sum_{i=1}^{n} \frac{1}{i}$ diverges. To see why this is so, we’ll employ a clever argument that collects terms together in powers of 2. We look at,⁶

$$\sum_{i=1}^{2^n} \frac{1}{i} = 1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \cdots + \left(\frac{1}{2^{n-1} + 1} + \cdots + \frac{1}{2^n}\right). \qquad (*)$$

Now observe that

$$\frac{1}{3} + \frac{1}{4} > \frac{1}{4} + \frac{1}{4} = \frac{1}{2},$$

$$\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} > \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} = \frac{1}{2},$$

$$\frac{1}{9} + \frac{1}{10} + \frac{1}{11} + \frac{1}{12} + \frac{1}{13} + \frac{1}{14} + \frac{1}{15} + \frac{1}{16} > 8 \cdot \frac{1}{16} = \frac{1}{2},$$

and we continue this argument until we get to

$$\frac{1}{2^{n-1} + 1} + \frac{1}{2^{n-1} + 2} + \cdots + \frac{1}{2^n} = \frac{1}{2^{n-1} + 1} + \frac{1}{2^{n-1} + 2} + \cdots + \frac{1}{2^{n-1} + 2^{n-1}}.$$

There are $2^{n-1}$ terms in this general sum and each term is at least $\frac{1}{2^n}$ (with all but the last strictly greater) so

$$\frac{1}{2^{n-1} + 1} + \frac{1}{2^{n-1} + 2} + \cdots + \frac{1}{2^n} > 2^{n-1} \cdot \frac{1}{2^n} = \frac{1}{2}.$$

If we count the number of brackets on the right-hand side of (*) and also include the terms 1 and $\frac{1}{2}$, we find that we have $(n + 1)$ terms altogether and we have seen that $n$ of these terms (all except the leading 1) are at least $\frac{1}{2}$.

6 This argument is due to Nicole Oresme (1323?-1382), a Parisian thinker who eventually became Bishop of Lisieux - see e.g. http://en.wikipedia.org/wiki/Nicole_Oresme

We conclude that $\sum_{i=1}^{2^n} \frac{1}{i} \geq 1 + n \cdot \frac{1}{2} = 1 + \frac{n}{2}$ and we can see from this that the series diverges. Indeed given any $K > 0$ we can find $n_0$ such that $1 + \frac{n}{2} > K$ for all $n > n_0$ (just take $n_0$ to be the smallest natural number larger than $2(K - 1)$) and then $\sum_{i=1}^{n} \frac{1}{i} > K$ for all $n > 2^{n_0}$.
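The lower bound used in this argument can be checked exactly for small $n$. The sketch below assumes Python’s standard `fractions` module. The function name `harmonic` and the cut-off $n \leq 10$ are illustrative choices. A finite check like this cannot show divergence, but it shows the bound $s_{2^n} \geq 1 + \frac{n}{2}$ at work.

```python
from fractions import Fraction

def harmonic(n):
    """Exact nth partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

print([harmonic(n) for n in range(1, 6)])  # s_1, ..., s_5 as exact fractions

# Oresme's bound: the partial sum up to 2^n is at least 1 + n/2.
bound_holds = all(harmonic(2 ** n) >= 1 + Fraction(n, 2) for n in range(0, 11))
print(bound_holds)
```

The partial sums grow very slowly, which is why the first few terms are such a poor guide to what happens in the long run.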

i=i 

At this stage you might be getting the feeling that all series are divergent. You can rest assured that that is far from the case. There are plenty of convergent series around as we’ll soon see. The next series on our list that we should consider is $\sum_{i=1}^{n} \frac{1}{i^2}$ but we need a few more tools before we can investigate that one. In fact we’ll need to know about the related series $\sum_{i=1}^{n} \frac{1}{i(i + 1)}$ and this will furnish our first example of a convergent series.

Example 6.1: $\sum_{i=1}^{\infty} \frac{1}{i(i + 1)}$

To show this series converges we’ll first find a neat formula for the nth partial sum. To begin with, you should check by cross-multiplication that

$$\frac{1}{i(i + 1)} = \frac{1}{i} - \frac{1}{i + 1}.$$

Next we write $\sum_{i=1}^{n} \frac{1}{i(i + 1)}$ as a ‘telescopic sum’:⁷

$$\sum_{i=1}^{n} \frac{1}{i(i + 1)} = \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \left(\frac{1}{3} - \frac{1}{4}\right) + \cdots + \left(\frac{1}{n} - \frac{1}{n + 1}\right) = 1 - \frac{1}{n + 1},$$

after cancellation. So we conclude that

$$\sum_{i=1}^{n} \frac{1}{i(i + 1)} = 1 - \frac{1}{n + 1}.$$

7 So called because the series can be compressed into a simple form by cancellation. The 
analogy is with the collapse of an old-fashioned telescope. 
However, we know that the sequence whose nth term is $\frac{1}{n+1}$ converges to 0 and
so we find that

\[ \sum_{i=1}^{\infty} \frac{1}{i(i+1)} = \lim_{n\to\infty} \sum_{i=1}^{n} \frac{1}{i(i+1)} = 1 - \lim_{n\to\infty} \frac{1}{n+1} = 1. \]

This is our first successful encounter with a convergent series so we should 
allow ourselves a quick pause for appreciation. In this case we also have an example 
where the sum of the series is explicitly known. This is in fact quite rare. In most 
cases where we can show that a series converges we will not know the limit 
explicitly. 
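The telescoping formula for the partial sums is easy to test on a computer. Here is a short Python sketch (an illustration only; the function names are assumptions made for this example) that compares the sum added up term by term with the closed form $1 - \frac{1}{n+1}$, using exact fractions so that no rounding is involved.

```python
from fractions import Fraction


def partial_sum(n: int) -> Fraction:
    """Add up 1/(i(i+1)) for i = 1, ..., n, term by term."""
    return sum((Fraction(1, i * (i + 1)) for i in range(1, n + 1)), Fraction(0))


def closed_form(n: int) -> Fraction:
    """The telescoped value 1 - 1/(n+1)."""
    return 1 - Fraction(1, n + 1)


if __name__ == "__main__":
    for n in (1, 2, 10, 100, 1000):
        s = partial_sum(n)
        # The gap 1 - s_n equals 1/(n+1), which shrinks to zero.
        print(n, s == closed_form(n), float(1 - s))
```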

At this stage it is worth thinking a little bit about the relationship between
the sequences $(a_n)$ and $(s_n)$ from the point of view of convergence. The last two
examples we’ve considered are $\sum_{i=1}^{n} \frac{1}{i}$ and $\sum_{i=1}^{n} \frac{1}{i(i+1)}$. In the first of these we have
$a_n = \frac{1}{n}$ and we know that $\lim_{n\to\infty} \frac{1}{n} = 0$, but we’ve shown that $(s_n)$ diverges. In
the second example, $a_n = \frac{1}{n(n+1)}$ and again we have $\lim_{n\to\infty} \frac{1}{n(n+1)} = 0$, but in this
case $(s_n)$ converges as we’ve just seen. It’s time for a theorem:


Theorem 6.4.1. If $\sum_{i=1}^{n} a_i$ converges then so does the sequence $(a_n)$ and $\lim_{n\to\infty} a_n = 0$.

Proof. Suppose that $\lim_{n\to\infty} s_n = l$ then we also have $\lim_{n\to\infty} s_{n-1} = l$ (think
about it). Now since

\[ a_n = s_n - s_{n-1}, \]

for all $n \ge 2$, we can use the algebra of limits, firstly to deduce that $(a_n)$ converges
and secondly to find that

\[ \lim_{n\to\infty} a_n = \lim_{n\to\infty} s_n - \lim_{n\to\infty} s_{n-1} = l - l = 0, \]

and we are done. □


It’s important to appreciate what Theorem 6.4.1 is really telling us. It says
that if $(s_n)$ converges then it follows that $(a_n)$ converges to zero. It should not be
confused with the converse statement: ‘if $(a_n)$ converges to zero then it follows
that $(s_n)$ converges’ which is false - and the case $a_n = \frac{1}{n}$ provides a counter-example.
On the other hand, one of the most useful applications of Theorem 6.4.1
is to prove that a series diverges, for if we use the fact that a statement is logically
equivalent to its contrapositive, $^{8}$ we see that Theorem 6.4.1 also tells us that if $(a_n)$


8 The contrapositive of ‘If P then Q’ is ‘If not Q then not P’.
does not converge to zero then $\sum_{i=1}^{n} a_i$ diverges, e.g. we see immediately that
$\sum_{i=1}^{n} \frac{i}{i+1}$ diverges since $\lim_{n\to\infty} \frac{n}{n+1} = \lim_{n\to\infty} \frac{1}{1 + \frac{1}{n}} = 1$ by algebra of limits.

By the way, we said that we’d only consider nonnegative series in this section, 
but you can check that Theorem 6.4.1 holds without this constraint. 


6.5 The Comparison Test 


The theory of series is full of tests for convergence which are various tricks that 
have been developed over the years for showing that a series is convergent or 
divergent. We’ll meet a small number of these in this chapter. In fact we can’t
proceed further in our quest to show convergence of $\sum_{i=1}^{n} \frac{1}{i^2}$ without being able to
use the comparison test, and we’ll present this as a theorem. The proof of (1) is
particularly sweet as it makes use of old friends from earlier chapters.


Theorem 6.5.1 (The Comparison Test). Suppose that $(a_n)$ and $(b_n)$ are sequences
with $0 \le a_n \le b_n$ for all $n$. Then

1. if $\sum_{i=1}^{n} b_i$ converges then so does $\sum_{i=1}^{n} a_i$,

2. if $\sum_{i=1}^{n} a_i$ diverges then so does $\sum_{i=1}^{n} b_i$.


Proof. Throughout this proof we’ll write $s_n = \sum_{i=1}^{n} a_i$ (as usual) and
$t_n = \sum_{i=1}^{n} b_i$.

1. We are given that the sequence $(t_n)$ converges and so it is bounded by
Theorem 4.2.1. In particular it is bounded above and since each $t_n$ is nonnegative,
it follows that there exists $K > 0$ such that $t_n = |t_n| \le K$ for all $n$. Now since
each $a_i \le b_i$ we have for all $n$ that

\[ s_n = \sum_{i=1}^{n} a_i \le \sum_{i=1}^{n} b_i = t_n \le K, \]

and so the sequence $(s_n)$ is bounded above. We have already pointed out
that $(s_n)$ is monotonic increasing and so we can apply Theorem 5.2.1 (1) to
conclude that $(s_n)$ converges as required.
2. This is really just a special case of Theorem 4.1.3, but since I didn’t prove
that result I will do so for this one. The sequence $(s_n)$ diverges so given any
$L > 0$ there exists a natural number $n_0$ such that $s_n > L$ whenever $n > n_0$.
But since each $b_i \ge a_i$ we can argue as in (1) to deduce that $t_n \ge s_n > L$ for
such $n$ and hence $(t_n)$ diverges. □


We’ll now (as promised) apply the comparison test to show that $\sum_{i=1}^{n} \frac{1}{i^2}$
converges.


Example 6.2: $\sum_{i=1}^{n} \frac{1}{i^2}$


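We know from Example 6.1 that $\sum_{i=1}^{n} \frac{1}{i(i+1)}$ converges and so by the algebra of
limits, so does $\sum_{i=1}^{n} \frac{2}{i(i+1)} = 2 \sum_{i=1}^{n} \frac{1}{i(i+1)}$.

Now for all natural numbers $i$,

\[ \frac{2}{i(i+1)} - \frac{1}{i^2} = \frac{1}{i}\left[\frac{2}{i+1} - \frac{1}{i}\right] = \frac{1}{i}\left[\frac{2i - (i+1)}{i(i+1)}\right] = \frac{i-1}{i^2(i+1)} \ge 0. \]

So if we take $a_i = \frac{1}{i^2}$ and $b_i = \frac{2}{i(i+1)}$, we have $0 \le a_i \le b_i$ and so by Theorem 6.5.1
(1), $\sum_{i=1}^{n} \frac{1}{i^2}$ converges.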
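To see the comparison test at work with actual numbers, here is a small Python sketch (an illustration under assumed names, not something taken from the argument above). It checks the term-by-term inequality $0 \le a_i \le b_i$ for $a_i = \frac{1}{i^2}$ and $b_i = \frac{2}{i(i+1)}$, and then confirms that the partial sums satisfy $s_n \le t_n < 2$, so that $(s_n)$ is increasing and bounded above.

```python
from fractions import Fraction


def a(i: int) -> Fraction:
    """Term of the series we want to understand: 1/i**2."""
    return Fraction(1, i * i)


def b(i: int) -> Fraction:
    """Term of the series we already understand: 2/(i(i+1))."""
    return Fraction(2, i * (i + 1))


def partial_sums(term, n: int) -> list:
    """Return [s_1, ..., s_n] for the series with the given terms."""
    total = Fraction(0)
    sums = []
    for i in range(1, n + 1):
        total += term(i)
        sums.append(total)
    return sums


if __name__ == "__main__":
    n = 200
    s, t = partial_sums(a, n), partial_sums(b, n)
    print(all(0 <= a(i) <= b(i) for i in range(1, n + 1)))
    print(all(s_k <= t_k < 2 for s_k, t_k in zip(s, t)))
    print(float(s[-1]), float(t[-1]))
```

The bound 2 comes from Example 6.1, since the partial sums of $\sum \frac{2}{i(i+1)}$ are exactly $2 - \frac{2}{n+1}$.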


We’ve now shown that $\sum_{i=1}^{n} \frac{1}{i^2}$ converges. But is it possible to find an exact value
for the sum of this series? The problem was first posed by Jacob Bernoulli (1654-1705). $^{9}$
As Bernoulli was living in Basle, Switzerland at the time the problem of
finding a real number $k$ such that $\sum_{i=1}^{\infty} \frac{1}{i^2} = k$ became known as the ‘Basle problem’.

The problem was solved by one of the greatest mathematicians of all time,


9 He was one of three brothers who all made important mathematical contributions.
Furthermore, the sons and grandsons of this remarkable trio produced another five
mathematicians - see e.g. http://en.wikipedia.org/wiki/Bernoulli_family. Be aware that Jacob
is sometimes called by his Anglicised name James and should not be confused with his younger
brother Johann (also called John).
Leonhard Euler $^{10}$ (1707-83) in 1735. He showed that

\[ \sum_{i=1}^{\infty} \frac{1}{i^2} = \frac{\pi^2}{6} \]

so that $k = \frac{\pi^2}{6}$ which is 1.6449 to four decimal places. This connection between
the sums of inverses of squares of natural numbers and $\pi$ - the universal constant
which is the ratio of the circumference of any circle to its diameter - is really
beautiful and may appear a little mysterious. To give Euler’s original proof goes
beyond the scope of this book. $^{11}$ If you take a university level course that teaches
you the idea of a Fourier series then there is a very nice succinct proof which
uses that concept in a delightful way.
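Although the proof is out of reach here, nothing stops us from watching the partial sums creep up towards $\frac{\pi^2}{6}$. The Python sketch below is purely illustrative (the function name is an assumption of this example, and ordinary floating-point numbers are used, so the last digits are only approximate). It also prints the gap between the limit and the nth partial sum, which lies between $\frac{1}{n+1}$ and $\frac{1}{n}$; this follows by comparing the tail of the series with telescoping sums like the one in Example 6.1.

```python
import math


def basel_partial_sum(n: int) -> float:
    """1 + 1/4 + 1/9 + ... + 1/n**2 in floating point."""
    # Add the smallest terms first to limit rounding error.
    return sum(1.0 / (i * i) for i in range(n, 0, -1))


if __name__ == "__main__":
    limit = math.pi ** 2 / 6
    for n in (10, 100, 1000, 10000):
        s = basel_partial_sum(n)
        # Columns: n, partial sum, gap to pi**2/6, and the bounds 1/(n+1), 1/n.
        print(n, round(s, 6), round(limit - s, 6), 1 / (n + 1), 1 / n)
```

So convergence is real but slow: ten thousand terms give only about four correct decimal places.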

Now that we have the comparison test up our sleeves, we can make much
more progress in our goal to fully understand when $\sum_{i=1}^{n} \frac{1}{i^r}$ converges for various
values of $r$.

Example 6.3: $\sum_{i=1}^{n} \frac{1}{i^r}$ for $r > 2$.

All the series of this type converge. To see this it’s enough to notice that if $i$ is
any natural number then whenever $r > 2$

\[ i^r \ge i^2. \]

Now by (L5) we have

\[ \frac{1}{i^r} \le \frac{1}{i^2} \]

and we can immediately apply the comparison test (Theorem 6.5.1 (1)) to deduce
that $\sum_{i=1}^{n} \frac{1}{i^r}$ converges for $r > 2$.


Example 6.4: $\sum_{i=1}^{n} \frac{1}{i^r}$ for $0 < r < 1$.

In this case we have $i^r \le i$ for all natural numbers $i$ and so again by (L5),
$\frac{1}{i} \le \frac{1}{i^r}$. Now apply the comparison test in the form Theorem 6.5.1 (2) to deduce
that $\sum_{i=1}^{n} \frac{1}{i^r}$ diverges for $0 < r < 1$.


10 See http://en.wikipedia.org/wiki/Leonhard_Euler 

11 The best account of this that I’ve come across is in Jeffrey Stopple’s superb textbook A
Primer of Analytic Number Theory, Cambridge University Press (2003). See the Further Reading
section for more about this book.




I’ve already told you how Euler found an exact formula for $\sum_{i=1}^{\infty} \frac{1}{i^2}$ which featured
$\pi^2$. He also discovered exact formulae for $\sum_{i=1}^{\infty} \frac{1}{i^r}$ where $r$ is an even number and
these are all expressed in terms of $\pi^r$. I won’t write down the exact formulae here
as they are more complicated than the case $r = 2$. We’ll postpone that to the next
chapter as there is a fascinating connection with the number $e$ which we’re going
to study there. $^{12}$ Remarkably, very little is still known about $\sum_{i=1}^{\infty} \frac{1}{i^r}$ in the case where
$r$ is an odd number ($r \ge 3$). We had to wait until 1978 for Roger Apéry (1916-94)
to prove that $\sum_{i=1}^{\infty} \frac{1}{i^3}$ is an irrational number!

To complete the story of $\sum_{i=1}^{n} \frac{1}{i^r}$ where $r$ is any real number, we should look at
$\sum_{i=1}^{n} \frac{1}{i^r}$ for $1 < r < 2$. We’ll need a new technique before we do that and this is the
theme of the next section.

Before we do that, let’s look at one more interesting series. We’ve shown that
$\sum_{i=1}^{\infty} \frac{1}{i}$ diverges. Now we’ll consider the sum of all reciprocals of the square-free
integers:

1, 2, 3, 5, 6, 7, 10, 11, 13, 14, ...,

i.e. those natural numbers that can never have a perfect square (other than 1) as a factor.

It seems that this is a ‘smaller sum’ so it may be possible that it converges. We’ll
denote the generic square-free integer as $i_{sf}$ where the suffix sf stands for ‘square-free’
and consider $\sum_{i_{sf}=1}^{\infty} \frac{1}{i_{sf}}$. $^{13}$ Remember that in Chapter 1, we showed that any
natural number $n$ can be written as $n = i_{sf} m^2$ where $m \le n$ is a natural number.


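Theorem 6.5.2. $\sum_{i_{sf}=1}^{\infty} \frac{1}{i_{sf}}$ diverges.


Proof. The result more or less follows from the inequality

\[ \sum_{i=1}^{n} \frac{1}{i} \le \left(\sum_{i_{sf}=1}^{n} \frac{1}{i_{sf}}\right)\left(\sum_{m=1}^{n} \frac{1}{m^2}\right). \]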



12 If you can’t wait, try looking in Section 6.2 of Stopple’s book which is cited in footnote 11.

13 It may be that a better notation for this is $\sum_{i_{sf} < \infty} \frac{1}{i_{sf}}$, which we define to mean precisely
$\lim_{n\to\infty} \sum_{i_{sf} \le n} \frac{1}{i_{sf}}$.
To see that this holds you need only use the fact that for any $1 \le i \le n$, $\frac{1}{i} = \frac{1}{i_{sf} m^2}$
where $m \le n$ and $i_{sf} \le n$, so $\frac{1}{i}$ certainly appears in the product of the sums on
the right-hand side - and that is enough. We’ve seen that $\sum_{m=1}^{\infty} \frac{1}{m^2}$ converges (to a
positive number which is the supremum of the sequence of partial sums) and so
we can make the right-hand side of the last inequality larger to obtain

\[ \sum_{i=1}^{n} \frac{1}{i} \le \left(\sum_{i_{sf}=1}^{n} \frac{1}{i_{sf}}\right)\left(\sum_{m=1}^{\infty} \frac{1}{m^2}\right) = k \sum_{i_{sf}=1}^{n} \frac{1}{i_{sf}}, \]

where $k = \sum_{m=1}^{\infty} \frac{1}{m^2}$ (which we pointed out earlier is in fact equal to $\frac{\pi^2}{6}$).
From this and Theorem 4.1.3 we see that the divergence of $\sum_{i_{sf}=1}^{\infty} \frac{1}{i_{sf}}$ follows from
that of $\sum_{i=1}^{\infty} \frac{1}{i}$. □
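The key step in this proof is that every natural number up to $n$ factors as a square-free number times a perfect square. The Python sketch below (an illustration only; the helper names are assumptions made for this example) computes that factorisation by trial division and then checks the inequality between the harmonic partial sum and the product of the two sums, using exact fractions.

```python
from fractions import Fraction


def squarefree_decomposition(n: int) -> tuple:
    """Return (sf, m) with n == sf * m * m and sf square-free."""
    sf, m, d = n, 1, 2
    while d * d <= sf:
        # Strip out every factor d*d, moving one copy of d into m.
        while sf % (d * d) == 0:
            sf //= d * d
            m *= d
        d += 1
    return sf, m


def is_squarefree(n: int) -> bool:
    """n is square-free exactly when its square part m equals 1."""
    return squarefree_decomposition(n)[1] == 1


def inequality_sides(n: int) -> tuple:
    """Left- and right-hand sides of the inequality for a given n."""
    lhs = sum(Fraction(1, i) for i in range(1, n + 1))
    sf_sum = sum(Fraction(1, i) for i in range(1, n + 1) if is_squarefree(i))
    sq_sum = sum(Fraction(1, m * m) for m in range(1, n + 1))
    return lhs, sf_sum * sq_sum


if __name__ == "__main__":
    print([i for i in range(1, 15) if is_squarefree(i)])
    for n in (1, 5, 20, 100):
        lhs, rhs = inequality_sides(n)
        print(n, float(lhs), float(rhs), lhs <= rhs)
```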


Corollary 6.5.1. There are infinitely many square-free integers. 

Proof. Suppose by way of contradiction that there was a largest square-free
integer $I_{sf}$. Then

\[ \sum_{i_{sf}=1}^{\infty} \frac{1}{i_{sf}} = \sum_{i_{sf}=1}^{I_{sf}} \frac{1}{i_{sf}}, \]

and this is a finite sum of numbers. This contradicts the result of
Theorem 6.5.2. □

Since all prime numbers are square-free you may wonder whether the sum
of all $\frac{1}{p}$ (where $p$ is prime) converges or diverges. We’ll come back to settle that
question in Chapter 8.


6.6 Geometric Series


In this section we’ll look at series such as

\[ 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots, \qquad (6.6.6) \]

\[ 6 + 18 + 54 + 162 + 486 + \cdots. \qquad (6.6.7) \]


Both of these are examples of geometric series, i.e. they are of the form

\[ \sum_{i=0}^{\infty} ar^i = a + ar + ar^2 + ar^3 + \cdots \]

The number $a$ is (surprise, surprise) called the
first term and $r$ is called the common ratio. $^{14}$ So in (6.6.6) $a = 1$ and $r = \frac{1}{2}$.
In (6.6.7), $a = 6$ and $r = 3$. You might have guessed that (6.6.7) diverges and
speculated that (6.6.6) may well converge. To investigate this in the general case
we’ll first find a general formula for the nth partial sum $s_n = \sum_{i=0}^{n} ar^i$. $^{15}$ This finite
series is sometimes called a geometric progression. To find the general formula we
first note that if $r = 1$ then

\[ s_n = a + a + \cdots + a = (n + 1)a. \qquad (6.6.8) \]

Now assume that $r \neq 1$ to find that

\[ s_n = a + ar + ar^2 + \cdots + ar^n. \qquad (6.6.9) \]

Then multiply both sides of (6.6.9) by $r$ to obtain

\[ r s_n = ar + ar^2 + ar^3 + \cdots + ar^n + ar^{n+1}. \qquad (6.6.10) \]

Notice that all the terms on the right-hand sides of (6.6.9) and (6.6.10) are
common to both equations, except the first term $a$ in (6.6.9) and the last term
$ar^{n+1}$ in (6.6.10). If we subtract (6.6.10) from (6.6.9) we obtain

\[ (1 - r)s_n = a - ar^{n+1}, \]

and since $r \neq 1$, it is legitimate to divide both sides by $1 - r$ to get the formula we
are seeking:

\[ s_n = \frac{a(1 - r^{n+1})}{1 - r}. \qquad (6.6.11) \]


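For example, you can use this to quickly calculate the sum of the first 10 terms
in (6.6.6) by spotting that $a = 1$ and $r = \frac{1}{2}$ in this case. So we want

\[ s_9 = \frac{1 - \left(\frac{1}{2}\right)^{10}}{1 - \frac{1}{2}} = 2 - \frac{1}{2^9} = \frac{1023}{512}. \]

You can use a similar argument to deduce that $s_n = 2 - \frac{1}{2^n}$ and this sequence
clearly converges to 2.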
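A formula like (6.6.11) is easy to test against brute-force addition. The following Python sketch is only an illustration (the function names are assumptions of this example); it adds up the terms one by one and compares the result with the closed form, using exact fractions for the series (6.6.6).

```python
from fractions import Fraction


def geometric_partial_sum(a, r, n: int):
    """s_n = a + a*r + ... + a*r**n, added term by term (n + 1 terms)."""
    return sum(a * r ** i for i in range(n + 1))


def geometric_formula(a, r, n: int):
    """Closed form (6.6.11) for r != 1, and (6.6.8) for r == 1."""
    if r == 1:
        return (n + 1) * a
    return a * (1 - r ** (n + 1)) / (1 - r)


if __name__ == "__main__":
    a, r = Fraction(1), Fraction(1, 2)
    for n in (0, 1, 2, 9, 20):
        direct = geometric_partial_sum(a, r, n)
        # For (6.6.6) both columns should agree with 2 - 1/2**n.
        print(n, direct, geometric_formula(a, r, n), 2 - Fraction(1, 2 ** n))
```

Remember that $s_n$ here has $n + 1$ terms, so the sum of the first 10 terms corresponds to `n = 9`.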

More generally if $|r| < 1$ (so $-1 < r < 1$) then, by Example 4.2, we have
$\lim_{n\to\infty} r^n = 0$. Hence if we apply the algebra of limits in (6.6.11) we see that the
geometric series converges and

\[ \sum_{i=0}^{\infty} ar^i = \frac{a}{1 - r}. \qquad (6.6.12) \]


14 To understand why the word ‘geometric’ is used here go to the section “Relationship to
geometry and Euclid’s work” in http://en.wikipedia.org/wiki/Geometric_progression

15 Note that we start at $i = 0$ here instead of $i = 1$, but this introduces no new difficulties in
finding limits. If you want to you can even systematically replace $\sum_{i=0}^{n} ar^i$ with $\sum_{i=1}^{n+1} ar^{i-1}$. However
do bear in mind that $s_n$ is now the sum of the first $n + 1$ terms.
In fact it is easy to check using (6.6.8) and (6.6.11) that the geometric series
converges for no other value of $r$. Indeed (taking $a > 0$) if $r \ge 1$ it diverges to $+\infty$, if $r = -1$ it
oscillates finitely between 0 and $a$ and if $r < -1$ it oscillates infinitely.

Now we are going to use the geometric series as a tool in proving that $\sum_{i=1}^{n} \frac{1}{i^r}$
converges for $1 < r < 2$. We’ll use a similar trick to the one we employed to prove
that $\sum_{i=1}^{n} \frac{1}{i}$ diverges. We’ll write the natural numbers in the form

\[ 2^0,\ 2^1,\ 2^1 + 1,\ 2^2,\ 2^2 + 1,\ 2^2 + 2,\ 2^3 - 1,\ 2^3, \ldots \]

Now consider all the terms written in this form that lie between $2^{i-1}$ and
$2^i - 1$ (including these two ‘end-points’) where $i$ is an arbitrary natural number.
There are exactly $2^{i-1}$ such numbers (think about it - and look at the case $i = 3$
from the list above for inspiration if necessary). Now define

\[ b_i = \frac{1}{(2^{i-1})^r} + \frac{1}{(2^{i-1} + 1)^r} + \frac{1}{(2^{i-1} + 2)^r} + \cdots + \frac{1}{(2^i - 1)^r}. \]

Since $\frac{1}{(2^{i-1})^r}$ is the largest number which appears on the right-hand side we get

\[ b_i \le 2^{i-1} \cdot \frac{1}{(2^{i-1})^r} = \left(\frac{1}{2^{r-1}}\right)^{i-1} \]

and so

\[ \sum_{i=1}^{n} b_i \le \sum_{i=1}^{n} \left(\frac{1}{2^{r-1}}\right)^{i-1}. \]

But the series on the right-hand side is a geometric one having first term 1 and
common ratio $\frac{1}{2^{r-1}} < 1$ since $r > 1$. So this geometric series converges and hence
by the comparison test so does $\sum_{i=1}^{n} b_i$. But this series is nothing but $\sum_{i=1}^{n} \frac{1}{i^r}$ rewritten
in a clever way that involves powers of 2. And that’s it. $^{16}$
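The block-by-block bound can also be watched in action. The Python sketch below is an illustration with assumed function names, and it uses floating-point arithmetic, so a tiny tolerance is allowed in comparisons. It computes each block $b_i$, the geometric bound $\left(\frac{1}{2^{r-1}}\right)^{i-1}$ on that block, and the sum $\frac{1}{1 - 2^{1-r}}$ of the bounding geometric series given by (6.6.12).

```python
def block_sum(i: int, r: float) -> float:
    """b_i: the sum of 1/k**r for k = 2**(i-1), ..., 2**i - 1."""
    return sum(1.0 / k ** r for k in range(2 ** (i - 1), 2 ** i))


def block_bound(i: int, r: float) -> float:
    """The geometric bound (1 / 2**(r-1)) ** (i-1) on b_i."""
    return (1.0 / 2 ** (r - 1)) ** (i - 1)


def geometric_limit(r: float) -> float:
    """Sum of the bounding geometric series: first term 1, ratio 1 / 2**(r-1)."""
    return 1.0 / (1.0 - 1.0 / 2 ** (r - 1))


if __name__ == "__main__":
    r = 1.5
    running = 0.0
    for i in range(1, 16):
        running += block_sum(i, r)
        # Columns: i, block b_i, its bound, sum of blocks so far.
        print(i, round(block_sum(i, r), 6), round(block_bound(i, r), 6), round(running, 6))
    print("all partial sums stay below", geometric_limit(r))
```

Nothing in the sketch uses $r < 2$; the same bound works for any $r > 1$, which fits with Example 6.3.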

We have now completed the story of $\sum_{i=1}^{\infty} i^r$ where $r$ is a real number. We have
shown that the series diverges if $r \ge -1$ and converges if $r < -1$. The ‘parameter’
$r$ plays a similar role here to the temperature of substances like water. Water is
frozen solid for temperatures below 0 degrees Celsius but at that temperature it
melts to form a liquid and this is called a ‘phase transition’ in physics. If we keep on
increasing the temperature then the water stays liquid until we get to 100 degrees
Celsius, when it changes to a gas and this is a second phase transition. By analogy
we can regard the value $r = -1$ as indicating a phase transition between the
regions where the series $\sum_{i=1}^{n} i^r$ diverges and converges (respectively). This analogy


16 In fact this is an example of regrouping a series (see Section 6.10) and to be completely
rigorous, Theorem 6.10.1 should precede the argument we’ve just given.
may not be as far-fetched as it seems as analysis plays an important role in the
mathematical modelling of real phase transitions.


6.7 The Ratio Test 


In the last section we saw the benefits of the comparison test for proving
convergence of series. Although it is a wonderful thing, it is by no means the
only tool that we need for playing the series game and indeed there are many
important series such as $\sum_{i=1}^{\infty} \frac{1}{i!}$ where it doesn’t help at all. $^{17}$ In this section we’ll
develop another useful test called the ratio test. $^{18}$ Before we prove this, it will be
helpful to make some general remarks about infinite series.

Suppose that we have a finite series $\sum_{i=1}^{n} a_i$ where each $a_i \ge 0$. We can split the
sum at any intermediate point

\[ \sum_{i=1}^{n} a_i = \sum_{i=1}^{m} a_i + \sum_{i=m+1}^{n} a_i. \qquad (6.7.13) \]

Can we do the same for infinite sums? Suppose that $\sum_{i=1}^{n} a_i$ converges to $l$. Now
consider the sequence whose nth term is $\sum_{i=m+1}^{n} a_i$. This series also converges by the
comparison test as we have (using the notation of Theorem 6.5.1) $b_i \le a_i$ for each
$i$ where $b_i = 0$ if $1 \le i \le m$ and $b_i = a_i$ if $i \ge m + 1$. It is then natural to define

\[ \sum_{i=m+1}^{\infty} a_i = \lim_{n\to\infty} \sum_{i=m+1}^{n} a_i. \]

Applying the algebra of limits in (6.7.13) we find that

\[ \sum_{i=m+1}^{\infty} a_i = \lim_{n\to\infty} \left( \sum_{i=1}^{n} a_i - \sum_{i=1}^{m} a_i \right) = \sum_{i=1}^{\infty} a_i - \sum_{i=1}^{m} a_i, \]


17 Remember that $i! = i(i - 1)(i - 2) \cdots 3 \cdot 2 \cdot 1$.

18 Sometimes called d’Alembert’s ratio test in honour of the French thinker Jean d’Alembert
(1717-83) who was the first to publish it - see http://en.wikipedia.org/wiki/Jean_le_Rond_d'Alembert
and so

\[ \sum_{i=1}^{\infty} a_i = \sum_{i=1}^{m} a_i + \sum_{i=m+1}^{\infty} a_i. \qquad (6.7.14) \]

Notice from this that to show $\sum_{i=1}^{\infty} a_i$ converges it’s enough to prove that $\sum_{i=m+1}^{\infty} a_i$
converges for any fixed $m$ which can be as large as we like. This makes sense from
an intuitive point of view as it’s the ‘tail’ of the series, i.e. its value beyond a certain
point, that determines convergence.

We’re now ready to describe the ratio test and as usual we’ll state the test as a 
theorem. The proof of this gives us another opportunity to appreciate the value 
of geometric series. 

Theorem 6.7.1 (The Ratio Test). Suppose that $(a_n)$ is a sequence of positive real
numbers for which $\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = l$, then

• $\sum_{i=1}^{n} a_i$ converges if $l < 1$.

• $\sum_{i=1}^{n} a_i$ diverges if $l > 1$.

• If $l = 1$ the test is inconclusive and the series may converge or diverge.

Proof. First suppose that $l < 1$ and notice that we can then find an $\epsilon > 0$ such
that

\[ l + \epsilon < 1 \quad \ldots \text{(i)}, \]

indeed no matter how close to 1 the real number $l$ may get, there is always some
gap and choosing $\epsilon = \frac{1}{2}(1 - l)$, for example, will only fill half of that gap.

Now let’s go back to the definition of the limit of a sequence. Since $\left(\frac{a_{n+1}}{a_n}\right)$
converges to $l$ and given the value of $\epsilon$ that we’ve just chosen to satisfy (i), we
know there exists a natural number $n_0$ such that if $n > n_0$ then $\left|\frac{a_{n+1}}{a_n} - l\right| < \epsilon$, i.e.

\[ l - \epsilon < \frac{a_{n+1}}{a_n} < l + \epsilon \quad \ldots \text{(ii)} \]

Now we can write each

\[ a_n = \frac{a_n}{a_{n-1}} \cdot \frac{a_{n-1}}{a_{n-2}} \cdots \frac{a_{n_0+2}}{a_{n_0+1}} \cdot a_{n_0+1}. \]

Each ratio on the right-hand side satisfies (ii) and so we have

\[ a_n \le (l + \epsilon)^{n - n_0 - 1} a_{n_0+1}, \]
since there are precisely $n - n_0 - 1$ ratios on the right-hand side. Now $a_{n_0+1}$ is
a fixed number and $\sum_{n=n_0+1}^{\infty} (l + \epsilon)^{n - n_0 - 1} a_{n_0+1}$ is a geometric series with first term
$a_{n_0+1}$ and common ratio $l + \epsilon$. This converges by (i) and so by the comparison
test (bearing in mind (6.7.14)), $\sum_{n=1}^{\infty} a_n$ converges.

Now suppose that $l > 1$ and choose $\epsilon = l - 1$ in the left-hand side inequality
of (ii). Then we can find $m_0$ such that if $m > m_0$ then $\frac{a_{m+1}}{a_m} > 1$, i.e. $a_{m+1} > a_m$.
But then we cannot possibly have $\lim_{m\to\infty} a_m = 0$ and so $\sum_{n=1}^{\infty} a_n$ diverges by
Theorem 6.4.1. $^{19}$

To see that anything can happen when $l = 1$ consider how $\sum_{i=1}^{n} a_i$ behaves as
$n \to \infty$ in the two cases $a_n = \frac{1}{n}$ and $a_n = \frac{1}{n^2}$. □

Note that Theorem 6.7.1 assumes implicitly that we are dealing with a series for
which $\lim_{n\to\infty} \frac{a_{n+1}}{a_n}$ exists. If it doesn’t then we cannot apply the ratio test (at least
not this form of it, see Exercise 6.11 for a more general version).


Example 6.5: $\sum_{i=1}^{n} \frac{x^i}{i!}$

We’ll use the ratio test to examine the convergence of the series $\sum_{i=1}^{n} \frac{x^i}{i!}$ where $x$
is an arbitrary positive number. This series will play an important role in the next
chapter when we’ll be learning about the irrational number that is denoted by $e$.
The ratio test is easy to apply in this case; we have

\[ a_n = \frac{x^n}{n!}, \qquad a_{n+1} = \frac{x^{n+1}}{(n+1)!}, \]

and so

\[ \frac{a_{n+1}}{a_n} = \frac{x^{n+1}}{(n+1)!} \cdot \frac{n!}{x^n} = \frac{x \cdot n!}{(n+1)\, n!} = \frac{x}{n+1} \to 0 \text{ as } n \to \infty, \]

irrespective of the value of $x$. So $l = 0 < 1$ and we conclude that the series
converges for all values of $x$. This proof has given us a lot. Not only do we know
that $\sum_{i=1}^{n} \frac{1}{i!}$ converges (in fact, as we’ll see in Chapter 7, once the $i = 0$ term is included the sum of the series is the

n 

special number e) but also that, e.g. 1Q2 ? 39 converges. 

1=1 
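The ratios in the ratio test are simple to compute, so it is instructive to tabulate them. The Python sketch below (illustrative only; the function names and the choice $x = 5$ are assumptions made for this example) prints $\frac{a_{n+1}}{a_n}$ for the series of Example 6.5, where the ratio is exactly $\frac{x}{n+1}$, and for the two series with $a_n = \frac{1}{n}$ and $a_n = \frac{1}{n^2}$, where the ratios both tend to 1 and the test tells us nothing.

```python
from fractions import Fraction
from math import factorial


def ratio(term, n: int) -> Fraction:
    """a_(n+1) / a_n for a sequence given as a function of n."""
    return term(n + 1) / term(n)


def exponential_term(x: Fraction):
    """Return the function n -> x**n / n!."""
    return lambda n: x ** n / factorial(n)


def harmonic_term(n: int) -> Fraction:
    return Fraction(1, n)


def inverse_square_term(n: int) -> Fraction:
    return Fraction(1, n * n)


if __name__ == "__main__":
    a = exponential_term(Fraction(5))
    for n in (1, 10, 100, 1000):
        print(n,
              float(ratio(a, n)),                    # tends to 0
              float(ratio(harmonic_term, n)),        # tends to 1, series diverges
              float(ratio(inverse_square_term, n)))  # tends to 1, series converges
```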


19 Recall that I told you that this result would be helpful in proving divergence. 
6.8 General Infinite Series


So far in this chapter we’ve only considered infinite series that have nonnegative
terms. But what about more general series such as $\sum_{i=1}^{n} (-1)^i$ or $\sum_{i=1}^{n} \frac{x^i}{i!}$ where $x$ is an
arbitrary real number? Let’s focus on the first of these. If we group the terms as

\[ (-1 + 1) + (-1 + 1) + (-1 + 1) + \cdots \]

then it seems to be converging to zero but if we write it as

\[ 1 + (-1 + 1) + (-1 + 1) + (-1 + 1) + \cdots \]

it looks like it converges to 1. But in fact this series diverges by Theorem 6.4.1.
We should take this as a ‘health warning’ that dealing with negative numbers in
infinite series might lead to headaches. We’ll come back to regrouping terms in
series later in this chapter. Now let’s focus on the general picture. We are interested
in the convergence (or otherwise) of a series $\sum_{i=1}^{n} a_i$ where $(a_n)$ is an arbitrary
sequence of real numbers, so we’ve dropped the constraint that these numbers are
all nonnegative. It’s a shame to lose all the knowledge we’ve gained in the early part
of this chapter so let’s introduce a link to that material. To each general series $\sum_{i=1}^{n} a_i$
we can associate the nonnegative series $\sum_{i=1}^{n} |a_i|$. Here’s a key definition. The series
$\sum_{i=1}^{n} a_i$ is said to be absolutely convergent if $\sum_{i=1}^{n} |a_i|$ converges. So, for example, the
series $\sum_{i=1}^{n} (-1)^{i+1} \frac{1}{i^2}$ is absolutely convergent since $\sum_{i=1}^{n} \left|(-1)^{i+1} \frac{1}{i^2}\right| = \sum_{i=1}^{n} \frac{1}{i^2}$ converges.
Now all we’ve done so far is make a definition. The next theorem tells us why this
is useful.

Theorem 6.8.1. Any absolutely convergent series is convergent. 

Proof. We want to show that the sequence whose nth term is $s_n = \sum_{i=1}^{n} a_i$
converges. Suppose that the series is absolutely convergent. Then the sequence $(t_n)$
converges where $t_n = \sum_{i=1}^{n} |a_i|$. Now each $|a_i| \ge a_i$ and $|a_i| \ge -a_i$ (recall Section 3.4)
and so

\[ 0 \le a_i + |a_i| \le 2|a_i|. \]

By algebra of limits, $2\sum_{i=1}^{n} |a_i|$ converges and hence by the comparison test so does
$u_n = \sum_{i=1}^{n} (a_i + |a_i|)$. Then by algebra of limits again we have convergence of $s_n =
u_n - t_n$ and our job is done. □
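The proof rests on splitting $s_n$ as $u_n - t_n$, where both pieces are partial sums of nonnegative series. Here is a short Python sketch of that bookkeeping for the example $a_i = (-1)^{i+1} \frac{1}{i^2}$ (an illustration only; the function names are assumptions made for this example, and exact fractions are used).

```python
from fractions import Fraction


def a(i: int) -> Fraction:
    """Terms of the example series: (-1)**(i+1) / i**2."""
    return Fraction((-1) ** (i + 1), i * i)


def sums(n: int) -> tuple:
    """Return (s_n, t_n, u_n) in the notation of the proof."""
    s = sum(a(i) for i in range(1, n + 1))              # the series itself
    t = sum(abs(a(i)) for i in range(1, n + 1))         # absolute values
    u = sum(a(i) + abs(a(i)) for i in range(1, n + 1))  # nonnegative terms
    return s, t, u


if __name__ == "__main__":
    for n in (1, 2, 5, 50):
        s, t, u = sums(n)
        # s_n = u_n - t_n holds exactly, and u_n never exceeds 2 t_n.
        print(n, float(s), float(t), float(u), s == u - t, u <= 2 * t)
```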

By Theorem 6.8.1 we see immediately that $\sum_{i=1}^{n} (-1)^{i+1} \frac{1}{i^2}$ converges. Next we
have an important example that picks up an earlier theme.


Example 6.6: $\sum_{i=1}^{n} \frac{x^i}{i!}$

We can now show that this series converges when $x$ is an arbitrary real number.
Indeed we’ve already shown this when $x > 0$. If $x < 0$ then

\[ \sum_{i=1}^{n} \left|\frac{x^i}{i!}\right| = \sum_{i=1}^{n} \frac{|x|^i}{i!} \]

and since $|x| > 0$ this last series converges. So $\sum_{i=1}^{n} \frac{x^i}{i!}$ is absolutely convergent and
hence is convergent by Theorem 6.8.1.


Both of the tests for convergence that we’ve met can be souped up into tests 
for general series. I’ll state these but if you want proofs, you’ll have to provide the 
details (see Exercise 6.12). Both cases are quite straightforward to deal with. 

The Comparison Test - general case. If $(a_n)$ is an arbitrary sequence of real
numbers and $(b_n)$ is a sequence of nonnegative numbers so that $|a_n| \le b_n$ for all
$n$ then if $\sum_{i=1}^{n} b_i$ converges so does $\sum_{i=1}^{n} a_i$.

The Ratio Test - general case. If $(a_n)$ is a sequence of real numbers for which
$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = l$, then if $l < 1$, $\sum_{i=1}^{n} a_i$ converges, if $l > 1$, $\sum_{i=1}^{n} a_i$ diverges and if
$l = 1$ then the test is inconclusive.


6.9 Conditional Convergence 


We’ve seen in the last section that every absolutely convergent series is convergent.
In this section we’ll focus on the convergence of series that may not be
absolutely convergent. For example, consider the series $\sum_{i=1}^{n} (-1)^{i+1} \frac{1}{i}$ which begins
$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \cdots$. This series is certainly not absolutely convergent
as we’ve already shown that the harmonic series $\sum_{i=1}^{n} \frac{1}{i}$ diverges. But the partial
sums of this series will certainly be smaller than those of the harmonic series,
so perhaps there is a chance that it will converge. Before we investigate further
we’ll need another definition. If $(a_n)$ is a sequence of real numbers for which
$\sum_{i=1}^{n} a_i$ converges but $\sum_{i=1}^{n} |a_i|$ diverges, then the series $\sum_{i=1}^{n} a_i$ is said to be conditionally
convergent. We’ve not met any conditionally convergent series yet but the next
theorem, which gives us another test for convergence, will give us the tool we 
need to find them. This convergence test is named after Gottfried Leibniz (1646- 
1716), 20 who was a renaissance man, par excellence! In his well-known book 
Men of Mathematics 21 that gives a series of short biographies of famous (male) 
mathematicians, E.T. Bell writes, ‘Mathematics was one of the many fields in 
which Leibniz showed conspicuous genius: law, religion, statecraft, history, logic, 
metaphysics and speculative philosophy all owe to him contributions, any one of 
which would have secured his fame and preserved his memory’. 


Theorem 6.9.1 (Leibniz’ Test). Let $(a_n)$ be a sequence of nonnegative numbers
that is

(a) monotonic decreasing, (b) convergent to zero.

In this case the series $\sum_{i=1}^{n} (-1)^{i+1} a_i$ converges.

Proof. Let $s_n = \sum_{i=1}^{n} (-1)^{i+1} a_i$. Let $n$ be even so that $n = 2m$ for some $m$. Then

\[ s_{2m} = (a_1 - a_2) + (a_3 - a_4) + \cdots + (a_{2m-1} - a_{2m}). \]

It follows that

\[ s_{2m+2} = s_{2(m+1)} = s_{2m} + (a_{2m+1} - a_{2m+2}) \ge s_{2m}, \]

since $(a_n)$ is monotonic decreasing. This shows that the sequence $(s_{2n})$ is
monotonic increasing. It is also bounded above since

\[ s_{2m} = a_1 - (a_2 - a_3) - (a_4 - a_5) - \cdots - (a_{2m-2} - a_{2m-1}) - a_{2m} \le a_1. \]

Here we’ve used the fact that the sequence $(a_n)$ is nonnegative so that $a_{2m} \ge 0$
and that it is also monotonic decreasing, so each bracket is nonnegative. We can


20 See http://en.wikipedia.org/wiki/Gottfried_Leibniz 

21 This famous book was first published in 1937. The most recent edition is in Touchstone
Press (Simon and Schuster Inc.) 1986.
now apply Theorem 5.2.1 (1) to conclude that $(s_{2m})$ converges and we define
$s = \lim_{m\to\infty} s_{2m}$. We now know that even partial sums converge. What about the
odd ones? Well by algebra of limits we have

\[ \lim_{m\to\infty} s_{2m+1} = \lim_{m\to\infty} s_{2m} + \lim_{m\to\infty} a_{2m+1} = s + 0 = s. \]

We’ve shown that $\lim_{m\to\infty} s_{2m} = s$ and $\lim_{m\to\infty} s_{2m+1} = s$. To prove the
theorem we need to show that the full sequence $(s_n)$ converges. It seems feasible
that if it does, then $\lim_{n\to\infty} s_n = s$ and this is what we’ll now prove. It’s about time
we had an $\epsilon$ and an $n_0$ again, so let’s fix $\epsilon > 0$. Then from what has been proved
above, there exists $n_0$ such that if $m > n_0$ then $|s_{2m} - s| < \epsilon$ and there exists $n_1$
such that if $m > n_1$ then $|s_{2m+1} - s| < \epsilon$. Now let $n > 2\max(n_0, n_1) + 1$. Then either
$n$ is even and so $n = 2m$ for some $m$, or $n$ is odd in which case $n = 2m + 1$; in both
cases $m > \max(n_0, n_1)$. In
either case we have $|s_n - s| < \epsilon$ and the result is proved. □

It is very easy to apply Leibniz’ test to see immediately that $\sum_{i=1}^{n} (-1)^{i+1} \frac{1}{i}$
converges and this gives us a nice example of a conditionally convergent series.
In fact this is an example of a series where the sum is known and it is $\log_e(2)$ (the
logarithm to base $e$ of 2 whose decimal expansion begins 0.6931471). The proof
uses calculus which goes beyond the scope of this book but I’ve included a sketch
below for those who know some integration (and as an incentive to learn about it
for those who don’t).

We start with the following binomial series expansion which is valid for
$-1 < x < 1$:

\[ (1 + x)^{-1} = \sum_{n=0}^{\infty} (-1)^n x^n = 1 - x + x^2 - x^3 + x^4 - \cdots \]

Now integrate both sides (the interchange of integration with infinite
summation on the right-hand side needs justification) to get

\[ \log_e(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots = \sum_{n=0}^{\infty} (-1)^n \frac{x^{n+1}}{n+1}. \]

Because of the constraint on $x$ we can’t just put $x = 1$ (tempting though this
may be) but after some careful work, it turns out that you can take the limit as
$x \to 1$ (from below) and this gives the required result.
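The behaviour described in the proof of Leibniz’ test shows up clearly in a computation. The Python sketch below is an illustration with an assumed function name, using floating-point arithmetic. It lists partial sums of $1 - \frac{1}{2} + \frac{1}{3} - \cdots$: the even ones climb, the odd ones fall, and both squeeze in on $\log_e(2)$. It also uses the standard fact (not proved in the text) that the distance from $s_n$ to the limit is less than the first omitted term $\frac{1}{n+1}$.

```python
import math


def alternating_partial_sums(n: int) -> list:
    """[s_1, ..., s_n] for the series 1 - 1/2 + 1/3 - 1/4 + ..."""
    sums, total = [], 0.0
    for i in range(1, n + 1):
        total += (-1) ** (i + 1) / i
        sums.append(total)
    return sums


if __name__ == "__main__":
    s = alternating_partial_sums(2000)
    target = math.log(2)
    for n in (1, 2, 3, 4, 10, 11, 100, 101, 2000):
        # Columns: n, s_n, distance to log 2, and the bound 1/(n+1).
        print(n, round(s[n - 1], 7), round(abs(s[n - 1] - target), 7), round(1 / (n + 1), 7))
```

As with the harmonic series, convergence is painfully slow, so a calculation like this could never replace the proof.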
6.10 Regrouping and Rearrangements


There are two ways in which we can mix up the terms in an infinite series -
by regrouping and by rearrangement. In the first of these we add the series in a
different way by bracketing the terms differently (I did this earlier for the series
$\sum_{i=1}^{n} (-1)^i$) but we don’t change the order in which terms appear. In the second,
we mix up the order in which terms appear as much as we please. For example,
consider the series $\sum_{i=1}^{n} \frac{1}{i^2} = 1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \frac{1}{25} + \cdots$. An example of a regrouping
of this series is



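\[ \left(1 + \frac{1}{4} + \frac{1}{9}\right) + \left(\frac{1}{16} + \frac{1}{25} + \frac{1}{36}\right) + \cdots = \frac{49}{36} + \frac{469}{3600} + \cdots \]

and here is a rearrangement of the same series:

\[ \frac{1}{9} + \frac{1}{25} + \frac{1}{36} + \frac{1}{4} + \frac{1}{49} + \frac{1}{16} + \cdots \]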


If a series is convergent then regrouping can’t do it much harm but 
rearrangements can wreak havoc as we will see. First let’s look at regrouping: 


Theorem 6.10.1. If a series converges to l then it continues to converge to the 
same limit after any regrouping. 


Proof. Suppose the series $\sum_{i=1}^{n} a_i$ converges to $l$. We’ll write a general regrouping
of the series as follows:

\[ b_1 = a_1 + a_2 + \cdots + a_{m_1}, \]

\[ b_2 = a_{m_1+1} + a_{m_1+2} + \cdots + a_{m_2}, \]

\[ \vdots \]

\[ b_n = a_{m_{n-1}+1} + a_{m_{n-1}+2} + \cdots + a_{m_n}. \]

Then $\sum_{i=1}^{n} b_i = \sum_{r=1}^{m_n} a_r$ and we have $m_n \ge n$. Since the original series converges we
know that given $\epsilon > 0$ there exists $n_0$ such that if $m_n > n_0$ then

\[ \left| \sum_{r=1}^{m_n} a_r - l \right| < \epsilon. \]

Now if $n > n_0$ we must have $m_n > n_0$ and so $\left| \sum_{i=1}^{n} b_i - l \right| < \epsilon$ which gives the
required convergence. □
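The heart of this proof is that the partial sums of a regrouped series are just some of the partial sums of the original series, picked out at the positions $m_1, m_2, \ldots$ The Python sketch below illustrates this with exact fractions; the function name and the particular block ends are assumptions made for this example.

```python
from fractions import Fraction
from itertools import accumulate


def regroup(terms: list, ends: list) -> list:
    """Bracket terms into blocks b_1, b_2, ...; ends = [m_1, m_2, ...] (increasing)."""
    blocks, start = [], 0
    for end in ends:
        blocks.append(sum(terms[start:end], Fraction(0)))
        start = end
    return blocks


if __name__ == "__main__":
    terms = [Fraction(1, i * i) for i in range(1, 13)]  # 1, 1/4, 1/9, ...
    ends = [3, 6, 10, 12]                               # m_1, m_2, m_3, m_4
    blocks = regroup(terms, ends)
    original_sums = list(accumulate(terms))
    regrouped_sums = list(accumulate(blocks))
    print(blocks[:2])
    # Each regrouped partial sum equals the original partial sum s_(m_n).
    print(regrouped_sums == [original_sums[m - 1] for m in ends])
```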
The contrapositive of the statement of this theorem tells us that if a series
converges to more than one limit after regrouping then it diverges and this gives
us another proof that $\sum_{i=1}^{n} (-1)^i$ diverges, as we’ve already shown how to group it
in such a way that it converges to 0 and to 1.

Rearrangements are more complicated. We won’t prove anything about them 
here but will be content with just stating two results which are both originally 
due to the nineteenth-century German mathematician Bernhard Riemann 
(1826-66): 22 


• If $\sum_{i=1}^{n} a_i$ is absolutely convergent to $l$ then any rearrangement of the series is
also convergent to the same limit.

• If $\sum_{i=1}^{n} a_i$ is conditionally convergent then given any real number $x$, it is possible
to find a rearrangement such that the rearranged series converges to $x$. In
fact rearrangements can even be found for which the series diverges to $+\infty$
or $-\infty$.


The second result quoted here is quite mindboggling and the two results taken 
together illustrate that there is quite a significant difference in behaviour between 
absolutely convergent and conditionally convergent series. To see a concrete 
example of how to rearrange conditionally convergent series to converge to 
different values look at pp. 177-8 of D.Bressoud A Radical Approach to Real 
Analysis (second edition), Mathematical Association of America (2007). 23 
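If you would like to experiment before looking up that reference, here is a Python sketch of one standard way to build such a rearrangement for the series $1 - \frac{1}{2} + \frac{1}{3} - \cdots$. It is a greedy recipe that is not taken from this book: whenever the running total is at or below the target we use the next unused positive term, and otherwise the next unused negative term. The function name is an assumption made for this example, and floating-point arithmetic is used, so the results are approximate.

```python
def rearranged_partial_sum(target: float, n_terms: int) -> float:
    """Partial sum of a greedy rearrangement of 1 - 1/2 + 1/3 - 1/4 + ..."""
    pos = 1      # next unused odd denominator (positive term 1/pos)
    neg = 2      # next unused even denominator (negative term -1/neg)
    total = 0.0
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / pos   # too low: take the next positive term
            pos += 2
        else:
            total -= 1.0 / neg   # too high: take the next negative term
            neg += 2
    return total


if __name__ == "__main__":
    for target in (1.5, 0.0, -0.5, 3.0):
        print(target, rearranged_partial_sum(target, 100000))
```

Every term of the original series is used exactly once in the long run, because the positive terms alone and the negative terms alone both form divergent series, so the running total is forced to cross the target over and over again by ever smaller amounts.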

We’ll close this section with a remark about divergent series. You may think 
that once a series has been shown to diverge then that’s the end of the story. In 
fact it can sometimes make sense to assign a number to a divergent series and 
even refer to it as the ‘sum’ - where ‘sum’ is interpreted in a different way from 
the usual. For example consider a slight variation on our old friend (6.1.2) - the series $\sum_{i=1}^{\infty} (-1)^{i+1}$. The great Leonhard Euler noticed that this is a geometric series with first term 1 and common ratio $-1$. Even though the formula (6.6.12) is not valid in this context, Euler applied it and argued that the series ‘converges’ to $\frac{1}{2}$. Euler’s reasoning was incorrect here but his intuition was sound. If you redefine summation of a series to mean taking the limit of averages of partial sums rather than partial sums themselves, then this is precisely the answer that you get. If you
want to explore divergent series further from this point of view then a good place 
to start is Exercise 6.19. See also http://en.wikipedia.org/wiki/Divergent_series. 
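
The averaging idea is easy to try numerically. The following is a minimal sketch (the helper name `cesaro_averages` is an assumption for this example) that forms the partial sums of 1 - 1 + 1 - 1 + ... and then the running averages of those partial sums.

```python
def cesaro_averages(terms):
    """Return the running averages of the partial sums of the given terms."""
    averages = []
    partial_sum = 0.0
    running_total = 0.0  # sum of all partial sums seen so far
    for n, a in enumerate(terms, start=1):
        partial_sum += a
        running_total += partial_sum
        averages.append(running_total / n)
    return averages


# The terms (-1)**(i + 1) for i = 1, ..., 1000, i.e. 1, -1, 1, -1, ...
terms = [(-1) ** (i + 1) for i in range(1, 1001)]
averages = cesaro_averages(terms)
print(averages[9], averages[99], averages[999])
```

The partial sums themselves keep jumping between 1 and 0, while their averages stay close to one half.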


22 See http://en.wikipedia.org/wiki/Bernhard_Riemann 

23 This book is briefly discussed in the Further Reading section. 
6.11 Real Numbers and Decimal Expansions


In this book we’ve adopted a working definition of a real number as one 
that has a decimal expansion. But what do we really mean by this? Since 
$a_0.a_1a_2a_3a_4\cdots = a_0 + 0.a_1a_2a_3a_4\cdots$ if the number $a_0.a_1a_2a_3a_4\cdots$ is positive and $a_0 - 0.a_1a_2a_3a_4\cdots$ if it is negative, we might as well just consider those real numbers that lie between 0 and 1 for the purposes of this discussion. Now

$$0.a_1a_2a_3a_4\cdots = \frac{a_1}{10} + \frac{a_2}{100} + \frac{a_3}{1000} + \frac{a_4}{10^4} + \cdots$$

so we can see that decimal expansions are really nothing but a convenient shorthand for convergent infinite series of the form $\sum_{n=1}^{\infty} \frac{a_n}{10^n}$. As we’ve already

remarked, a rational number either has a finite decimal expansion such as 
T = ^ + yjq + jggg + • • • or the a n s are given by a periodic (or eventually 

periodic) pattern such as $\frac{1}{3} = \sum_{n=1}^{\infty} \frac{3}{10^n}$ so each $a_n = 3$ in this example.

By the way, it appears that we have privileged the number 10 in this story 
but (as discussed in section 2.1) that is just a matter of convenience and collective 
habit. We could just as easily work in base 2 for example and represent all numbers 

between 0 and 1 as binary decimals $\sum_{n=1}^{\infty} \frac{a_n}{2^n}$, e.g. $\frac{1}{2} = 0.1$ in this base and $\frac{1}{3} = 0.010101\cdots = \sum_{n=1}^{\infty} \frac{1}{4^n}$. This is of course a geometric series and you should check that it has the

right limit. We stick to base 10 from now on because we’re used to it (recall the 
discussion in Section 2.1). 

An interesting phenomenon occurs with numbers whose decimal expansion is always 9 after a given point, so they look like

$$x = 0.a_1a_2\cdots a_N999\cdots = 0.a_1a_2\cdots a_N\dot{9} = 0.a_1a_2\cdots a_N + \sum_{r=N+1}^{\infty} \frac{9}{10^r}.$$

Let’s focus on $\sum_{r=N+1}^{\infty} \frac{9}{10^r}$. This is a geometric series whose first term is $\frac{9}{10^{N+1}}$ and common ratio is $\frac{1}{10}$. So it converges to

$$\frac{\frac{9}{10^{N+1}}}{1 - \frac{1}{10}} = \frac{\frac{9}{10^{N+1}}}{\frac{9}{10}} = \frac{1}{10^N}.$$

This means that $x = 0.a_1a_2\cdots a_N + 0.00\cdots01$ where the 1 is in the $N$th place after the decimal point. So we can write $x = 0.a_1a_2\cdots a_N'$, where $a_N' = a_N + 1$, e.g. $0.367\dot{9} = 0.368$ and $0.999999\cdots = 0.\dot{9} = 1$.

Generally two distinct decimal expansions that differ in only one place give 
rise to different numbers. The phenomenon that occurs with repeating nines is 
a very special one where the notation breaks down and appears to be giving you 
two different numbers that are in fact identical. From a common-sense point of view this may be quite obvious as e.g. $\frac{1}{3} = 0.\dot{3}$ and $1 = 3 \times \frac{1}{3} = 3 \times 0.\dot{3} = 0.\dot{9}$.

Just before we conclude this chapter we may ask whether every series $\sum_{n=1}^{\infty} \frac{a_n}{10^n}$ converges to give a legitimate decimal expansion. Here the $a_n$s are chosen from 0, 1, 2, . . . , 9. It’s easy to establish convergence. Since each $a_n \le 9$ we can use the comparison test as $\sum_{n=1}^{\infty} \frac{9}{10^n} = 1$. Thus we have a complete correspondence between numbers that lie between 0 and 1 and infinite series of the form $\sum_{n=1}^{\infty} \frac{a_n}{10^n}$.
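
The claims of this section can be checked with exact rational arithmetic. The sketch below is only an illustration; the helper `expansion_partial_sum` and the choice of twelve repeated digits are assumptions made for the example. It sums finitely many terms of a digit expansion and measures exactly how far the result falls short of its limit.

```python
from fractions import Fraction


def expansion_partial_sum(digits, base=10):
    """Exact value of the sum of digits[n-1] / base**n over the given digits."""
    return sum(Fraction(d, base ** n) for n, d in enumerate(digits, start=1))


k = 12  # number of repeated digits kept (an arbitrary choice)

# 0.333...3 with k threes falls short of 1/3 by exactly 1/(3 * 10**k).
gap_to_third = Fraction(1, 3) - expansion_partial_sum([3] * k)

# 0.367999...9 with k nines falls short of 0.368 by exactly 1/10**(3 + k).
gap_to_368 = Fraction(368, 1000) - expansion_partial_sum([3, 6, 7] + [9] * k)

# Binary digits 0, 1, 0, 1, ... give 1/4 + 1/16 + ..., which approaches 1/3.
gap_binary = Fraction(1, 3) - expansion_partial_sum([0, 1] * k, base=2)

print(gap_to_third, gap_to_368, gap_binary)
```

Each gap shrinks by a fixed factor with every extra digit, which is the geometric series argument in numerical form.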


6.12 Exercises for Chapter 6 


1 . Investigate the convergence of the following series and find the sum whenever 


this exists 




(a) 

(b) 

OO 

y n (n+]) 

OO 

(<=) E 


n= 0 

n= 1 

n= 1 

(d) 

OO 1 

y 1 , 

“, n (n + 2 ) 

OO 

(e) ^sin n (6») 

where 


(n + 1 )(n + 2) 


2. Use known results about sequences to give thorough proofs that if $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ both converge then,

(a) $\sum_{n=1}^{\infty} (a_n + b_n)$ converges,

(b) $\sum_{n=1}^{\infty} \lambda a_n$ converges for all real numbers $\lambda$.

Formulate and prove similar results which pertain to divergence in the case where both series are properly divergent to $+\infty$ or to $-\infty$. Why doesn’t (a) extend to the case where one of the series is properly divergent to $+\infty$ while the other is properly divergent to $-\infty$?

3. Show that if $\sum_{n=1}^{\infty} a_n$ converges then $\lim_{N \to \infty} \sum_{n=N}^{\infty} a_n = 0$.

4. (a) Use geometric series to write the recurring decimal $0.\dot{1}\dot{7}$ as a fraction in its lowest terms. [Hint: $0.\dot{1}\dot{7} = \frac{17}{100} + \frac{17}{10000} + \cdots$.]

(b) Suppose that $c$ is a block of $m$ whole numbers in a recurrent decimal expansion $0.\dot{c}$ (e.g. if $c = 1234$ then $m = 4$ and $0.\dot{c} = 0.\dot{1}23\dot{4}$). Deduce that $0.\dot{c} = \frac{c}{10^m - 1}$.

5. (a) Show that each of the series whose nth term 24 is given below diverges

(i) $(1 + \epsilon)^n$ where $\epsilon > 0$ (ii) $\frac{n+1}{n+2}$

(b) Find all values of $\epsilon$ for which the series whose nth term is $(-1 + \epsilon)^n$

(i) converges, (ii) diverges.


6. Use the comparison test to investigate the convergence of the series whose
nth term is as follows:


(a) 


1 


1 + n z 


(b) 


1 + cos(n) 


(c) 


n l - 1 


(d) 


n d + 1 


(e) 


2 + sin(n) 


7. Use the ratio test to investigate the convergence of the series whose nth term 
is as follows: 


(a) 


m 2 

(2n)!’ 



(c) 


n! 
~rf ’ 


(d) 


n n 

~n\' 


Note that the solutions to (c) and (d) require some knowledge of the number e, 
which is discussed in Chapter 7. 


8. Use any appropriate technique to investigate the convergence of the series
whose nth term is as follows:


(a) 


1 

~rf’ 


(b) \/n + 1 - sfn, 


(c) 


1 

\f n{n | =()’ 


(d) 


n!(n + 4)1 
(2n)! 


Once again for (a) it helps if you know about e (or alternatively do Problem 13
first).


9. Show that if $\sum_{n=1}^{\infty} a_n$ is a convergent series of nonnegative real numbers and $(b_n)$ is a bounded sequence of nonnegative real numbers, then the series $\sum_{n=1}^{\infty} a_n b_n$ also converges.


10. Show that if $\sum_{n=1}^{\infty} a_n$ is a convergent series of positive real numbers then the series $\sum_{n=1}^{\infty} \sqrt{a_n a_{n+1}}$ is also convergent.


24 Here and below by the ‘nth term of the series’ is meant $a_n$ in $\sum_{n=1}^{\infty} a_n$.
11. Prove the following more powerful form of the ratio test which does not assume that $\lim_{n \to \infty} \frac{a_{n+1}}{a_n}$ exists:

Let $(a_n)$ be a sequence of positive numbers for which there exists $0 < r < 1$ and a whole number $n_0$ such that for all $n \ge n_0$, $\frac{a_{n+1}}{a_n} \le r$ then $\sum_{n=1}^{\infty} a_n$ converges. If $\frac{a_{n+1}}{a_n} \ge 1$ for all $n \ge n_0$ then $\sum_{n=1}^{\infty} a_n$ diverges.


12. Give thorough proofs of the comparison test and the ratio test for series of the form $\sum_{n=1}^{\infty} a_n$ where $(a_n)$ is an arbitrary real-valued sequence.


13. Although they are the best known, the comparison test and the ratio test are not the only tests for convergent series. Another well-known test is Cauchy’s root test. This states that if $(a_n)$ is a sequence with each $a_n \ge 0$ and if there exists $0 < r < 1$ with $\sqrt[n]{a_n} \le r$ for all $n$ then $\sum_{n=1}^{\infty} a_n$ converges, but if $\sqrt[n]{a_n} \ge 1$ for all $n$ then $\sum_{n=1}^{\infty} a_n$ diverges. To prove this result:

(a) Assume $\sqrt[n]{a_n} \le r$ with $0 < r < 1$. Show that $a_n \le r^n$ for all $n$ and hence use the comparison test with a geometric series to show $\sum_{n=1}^{\infty} a_n$ converges.

(b) Suppose that $\sqrt[n]{a_n} \ge 1$ for all $n$. Deduce that $\lim_{n \to \infty} a_n = 0$ cannot hold and hence show that $\sum_{n=1}^{\infty} a_n$ diverges.

(c) Deduce the stronger form of the root test whereby for convergence we only require that $\sqrt[n]{a_n} \le r$ for all $n \ge n_0$ where $n_0$ is a given whole number and for divergence we ask that $\sqrt[n]{a_n} \ge 1$ for infinitely many $n$.


14. In Section 6.6 we showed that $\sum_{n=1}^{\infty} \frac{1}{n^r}$ converges whenever $r > 1$. You can use a similar argument to prove another test for convergence of series called Cauchy’s condensation test: if $(a_n)$ is a nonnegative monotonic decreasing sequence, then $\sum_{n=1}^{\infty} a_n$ converges if and only if $\sum_{n=1}^{\infty} 2^n a_{2^n}$ converges.

Hint: First show that for each natural number $k$,

$$a_{2^k} + a_{2^k+1} + \cdots + a_{2^{k+1}-1} \le 2^k a_{2^k},$$

$$a_{2^k+1} + a_{2^k+2} + \cdots + a_{2^{k+1}} \ge 2^k a_{2^{k+1}}.$$


1 5. For each of the following series, decide whether it is (a) convergent, (b) 
absolutely convergent, giving your reasons in each case. 


OO 

0)E 


n= 1 


(- 1) n+1 

y/n 


OO 

(") E 


n= 1 


Mr 1 

3 n 2 - 2 n’ 


E 

n= 1 


cos(r?7r) 
16. Only one of the following statements is true. Present either a counter-example or a proof in each case.

(a) If $\sum_{n=1}^{\infty} a_n$ converges then so does $\sum_{n=1}^{\infty} |a_n|$.

(b) If $\sum_{n=1}^{\infty} |a_n|$ converges then so does $\sum_{n=1}^{\infty} a_n$.

00 oo 

1 7. Give an example of a sequence (a n ) for which y^a n converges but J2 a n does 

n= i n=1 

not. 

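18. Recall Cauchy’s inequality for sums from Exercise 3.9: If $a_1, a_2, \ldots, a_n$ and $b_1, b_2, \ldots, b_n$ are real numbers then

$$\sum_{i=1}^{n} |a_i b_i| \le \left(\sum_{i=1}^{n} a_i^2\right)^{\frac{1}{2}} \left(\sum_{i=1}^{n} b_i^2\right)^{\frac{1}{2}}.$$

Now extend this inequality to series, so if $(a_n)$ and $(b_n)$ are sequences and we are given that both $\sum_{n=1}^{\infty} a_n^2$ and $\sum_{n=1}^{\infty} b_n^2$ converge, show that $\sum_{n=1}^{\infty} a_n b_n$ converges absolutely and that

$$\sum_{n=1}^{\infty} |a_n b_n| \le \left(\sum_{n=1}^{\infty} a_n^2\right)^{\frac{1}{2}} \left(\sum_{n=1}^{\infty} b_n^2\right)^{\frac{1}{2}}.$$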

19. Suppose that $\sum_{n=1}^{\infty} a_n$ diverges. It is sometimes useful to find the sum of an associated series that really converges. One example of a summation technique for associating a convergent series to a divergent one is Cesàro summation. Recall the sequence of partial sums $(s_n)$ where $s_n = \sum_{i=1}^{n} a_i$. We define the Cesàro average of the partial sums to be $s'_n = \frac{1}{n}(s_1 + s_2 + \cdots + s_n)$ and if $\lim_{n \to \infty} s'_n$ exists then we say that the series $\sum_{n=1}^{\infty} a_n$ is Cesàro summable.

(a) Prove that if $\sum_{n=1}^{\infty} a_n$ converges in the usual sense then it is also Cesàro summable and that both limits are the same.

(b) Let $a_n = (-1)^{n+1}$. Show that $\sum_{n=1}^{\infty} a_n$ is Cesàro summable and that the limit of the Cesàro averages is $\frac{1}{2}$.





Part II 


Exploring Limits 




Wonderful Numbers - e, π and γ

Thus e became the first number to be defined by a limiting process.

e: The Story of a Number, E. Maor


In this chapter we’ll look at three really interesting real numbers. Of course all
numbers are interesting, but these three are of particular importance because 
they are so prevalent throughout science and mathematics. 


7.1 The Number e 


Suppose that you invest some money into a bank account at an interest rate r per 
annum. For example if the rate is 6% then r = 0.06. The amount you invest is 
called the principal and we denote it by the letter P. After a year has passed you 
have earned rP interest and so your initial investment of P has grown to P(1 + r). 
From now on we’ll simplify as much as possible and take P = 1 (pound, dollar, 
euro, yen - whatever you like). Now suppose that instead of interest being paid 
after a year it is paid at six-monthly intervals. Then after six months, your initial investment has grown to $\left(1 + \frac{r}{2}\right)$. Suppose this money is reinvested for the next six months. Then after a year you will have $\left(1 + \frac{r}{2}\right)^2$. By similar arguments you can see that if interest is paid monthly then after a year you have $\left(1 + \frac{r}{12}\right)^{12}$, if it is paid daily (and we are not in a leap year) then your investment is worth $\left(1 + \frac{r}{365}\right)^{365}$ at the end of the year and if it is paid every minute then your money grows to $\left(1 + \frac{r}{525600}\right)^{525600}$. Now the reader should calculate all of these numbers in the simplest case where we take $r = 1$. What happens if interest is paid every second - and every hundredth of a second? You should by now have realised that you are calculating terms in the sequence whose nth term is $\left(1 + \frac{1}{n}\right)^n$, and you should have accumulated some persuasive evidence that the sequence is converging to a number that is reasonably close to 2.718281828 . . . But how do we know the limit really exists? We’ll demonstrate that now.
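
For readers who would like to let a computer do the arithmetic, here is a minimal sketch of the suggested calculation with $r = 1$. The function name `compound_growth` and the dictionary of compounding periods are assumptions for this example, and `math.log1p` is used only to keep the floating-point computation accurate when $r/n$ is tiny; mathematically the function returns $(1 + r/n)^n$.

```python
import math


def compound_growth(n, r=1.0):
    """Value after one year of 1 unit at rate r, compounded n times: (1 + r/n)**n."""
    # exp(n * log(1 + r/n)) equals (1 + r/n)**n; log1p avoids rounding loss.
    return math.exp(n * math.log1p(r / n))


periods = {
    "yearly": 1,
    "six-monthly": 2,
    "monthly": 12,
    "daily": 365,
    "every minute": 525600,
    "every second": 31536000,
}
for label, n in periods.items():
    print(f"{label:>13}: {compound_growth(n):.9f}")
```

The printed values increase with the number of compounding periods and stay below 3, which is the behaviour the theorem that follows makes precise.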

Theorem 7.1.1. The sequence whose nth term is $\left(1 + \frac{1}{n}\right)^n$ converges to a real number which we call $e$. We have $2 \le e \le 3$.


Proof. We’ll do this in stages. 

Stage 1: Our sequence is monotonic increasing 

To prove this we’ll use an old friend - the Theorem of the Means (Theorem 3.5.1) which told us that if we have $n$ positive real numbers $a_1, a_2, \ldots, a_n$ then

$$\sqrt[n]{a_1 a_2 \cdots a_n} \le \frac{a_1 + a_2 + \cdots + a_n}{n}.$$

Now we’ll apply this result with $a_1 = a_2 = \cdots = a_{n-1} = 1 + \frac{1}{n-1}$ and $a_n = 1$. Then the geometric mean is $\left(1 + \frac{1}{n-1}\right)^{\frac{n-1}{n}}$ and the arithmetic mean is

$$\frac{(n-1)\left(1 + \frac{1}{n-1}\right) + 1}{n} = \frac{n+1}{n} = 1 + \frac{1}{n}.$$

So the theorem of the means tells us that

$$\left(1 + \frac{1}{n-1}\right)^{\frac{n-1}{n}} \le 1 + \frac{1}{n}$$

and raising both sides to the power $n$ gives

$$\left(1 + \frac{1}{n-1}\right)^{n-1} \le \left(1 + \frac{1}{n}\right)^n$$

which is exactly what we wanted to prove.


Stage 2: Our sequence is bounded 

To prove this we first use the binomial theorem to expand

$$\left(1 + \frac{1}{n}\right)^n = 1 + n \cdot \frac{1}{n} + \frac{n(n-1)}{2!}\frac{1}{n^2} + \frac{n(n-1)(n-2)}{3!}\frac{1}{n^3} + \cdots + \frac{1}{n^n}.$$

Using a little bit of algebra we can rewrite this to get

$$\begin{aligned}\left(1 + \frac{1}{n}\right)^n = {} & 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \frac{1}{3!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) + \cdots \\ & + \frac{1}{n!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{n-1}{n}\right).\end{aligned} \tag{7.1.1}$$

If you’re worried about where the last term came from, observe that when you simplify the product of brackets you obtain

$$\frac{(n-1)!}{n^{n-1}} \cdot \frac{1}{n!} = \frac{1}{n^n}.$$

The formula (7.1.1) is intriguing and we’ll return to it when the proof is complete. For now just notice that all of the terms taking the form $\left(1 - \frac{k}{n}\right)$ which appear in the right-hand side of (7.1.1) are between 0 and 1 and so we can deduce that

$$\left(1 + \frac{1}{n}\right)^n \le 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} \le \sum_{r=0}^{\infty} \frac{1}{r!}. \tag{7.1.2}$$

Now we proved that the infinite series appearing on the right-hand side of (7.1.2) converged, in Section 6.7. So its sum is an upper bound for our sequence. As the sequence is bounded above and monotonic increasing we can assert that it converges, by Theorem 5.2.1. Thus we can now legitimately define

$$e = \lim_{n \to \infty}\left(1 + \frac{1}{n}\right)^n. \tag{7.1.3}$$

Now we need to show that $2 \le e \le 3$. Before we do that, it’s helpful to take a quick diversion.


Stage 3: A Useful Inequality 

Let’s start by looking at factorials again: $n! = n(n-1)(n-2)\cdots 3 \cdot 2 \cdot 1$. There are $n$ numbers multiplied together on the right-hand side and all but one of them is greater than or equal to 2. That tells us that $n! \ge 2^{n-1}$ and so by (L5)

$$\frac{1}{n!} \le \frac{1}{2^{n-1}}. \tag{7.1.4}$$


Stage 4: The Bounds 

From the first line of (7.1.1), we see that 2 is a lower bound for the sequence $\left(1 + \frac{1}{n}\right)^n$. Since $e$ is the supremum of this sequence it then follows that $e \ge 2$. For the upper bound we first use the inequality just before (7.1.2) and then apply (7.1.4) to show that

$$\left(1 + \frac{1}{n}\right)^n \le 1 + 1 + \frac{1}{2!} + \cdots + \frac{1}{n!} \le 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^{n-1}}.$$

Now the geometric series $1 + \frac{1}{2} + \frac{1}{2^2} + \cdots$ has first term 1 and common ratio $\frac{1}{2}$ and so it converges to 2. Hence we see that 3 is an upper bound for the sequence with nth term $\left(1 + \frac{1}{n}\right)^n$. But $e$ is the supremum of this sequence and so $e \le 3$ as promised. □

We have now defined e as the limit of a sequence. The formula (7.1.1) which 
helped us do this is intriguing. It suggests that we might also be able to write e as 
the limit of a series: 1

$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{1}{n!}. \tag{7.1.5}$$

But beware. You may be tempted to try to prove this very quickly by taking 
limits on both sides of (7.1.1); we need to proceed with caution as the right-hand 
side is quite a complicated expression. 


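Theorem 7.1.2. $e = \sum_{n=0}^{\infty} \frac{1}{n!}$.

Proof. We see from (7.1.2) that $\sum_{n=0}^{\infty} \frac{1}{n!}$ is an upper bound for the sequence whose nth term is $\left(1 + \frac{1}{n}\right)^n$. But we proved in Theorem 7.1.1 that $e$ is the supremum of this sequence and so

$$e \le \sum_{n=0}^{\infty} \frac{1}{n!}.$$

Now consider (7.1.1) again but this time only take the first m terms on the right-hand side where $m < n$. It’s a good idea to give this quantity a name so define

$$c_n^{(m)} = 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \cdots + \frac{1}{m!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{m-1}{n}\right).$$

Then we have, 2

$$c_n^{(m)} \le \left(1 + \frac{1}{n}\right)^n \le e.$$

We fix $m$ for now and consider the sequence $(c_n^{(m)}) = (c_1^{(m)}, c_2^{(m)}, \ldots)$. We have just shown that it is bounded above by $e$. It is also monotonic increasing, indeed this follows easily from the fact that $1 - \frac{k}{n}$ is monotonic increasing for each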

1 Note that in (7.1.5) we define 0! = 1 and this is standard throughout mathematics.

2 m is playing the role of a label here and should not be confused with a power. That's why 
I’ve put it in brackets. 
$1 \le k \le m - 1$. Thus we see that the sequence converges to $1 + 1 + \frac{1}{2!} + \cdots + \frac{1}{m!}$ as $n \to \infty$ and since this limit is also the supremum we must have

$$1 + 1 + \frac{1}{2!} + \cdots + \frac{1}{m!} \le e.$$

Now the argument we’ve just used holds for arbitrary $m$. So we see that $e$ is an upper bound for the sequence whose mth term is $\sum_{n=0}^{m} \frac{1}{n!}$ and it follows that

$$\sum_{n=0}^{\infty} \frac{1}{n!} \le e.$$

Now we’ve shown that $e \le \sum_{n=0}^{\infty} \frac{1}{n!}$ and $\sum_{n=0}^{\infty} \frac{1}{n!} \le e$ and can only conclude that

$$e = \sum_{n=0}^{\infty} \frac{1}{n!}.$$

□
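
The two descriptions of $e$ converge at very different speeds, and this is easy to see numerically. The following is a small sketch (the function names are assumptions for this example) that compares the nth term of the sequence with the nth partial sum of the series.

```python
import math


def sequence_term(n):
    """The nth term (1 + 1/n)**n of the sequence defining e."""
    return (1 + 1 / n) ** n


def series_partial_sum(n):
    """1 + 1/1! + 1/2! + ... + 1/n!, built up one term at a time."""
    total, term = 1.0, 1.0
    for k in range(1, n + 1):
        term /= k  # term now equals 1/k!
        total += term
    return total


for n in (1, 5, 10, 20):
    print(n, sequence_term(n), series_partial_sum(n), math.e)
```

In line with (7.1.2), the sequence term never exceeds the corresponding partial sum, and the partial sums reach the limit far sooner.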


In fact the results of the last two theorems can be generalised. Suppose that $x$ is an arbitrary real number. Then it can be shown that

$$\lim_{n \to \infty}\left(1 + \frac{x}{n}\right)^n = \sum_{m=0}^{\infty} \frac{x^m}{m!}.$$

The surprising thing here is not so much that the two limits exist and are equal to each other but that they are equal to $e^x$. This tells us for example, that



n = 0 




which is far from obvious when you first see it. To see that this is true we really need to verify that

$$e^{x+y} = e^x e^y,$$

for all real numbers $x$ and $y$.
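
Before looking at why this holds, it can be reassuring to test it numerically. The sketch below is an illustration only: the names `exp_series` and `exp_limit`, the 60-term cut-off and the value $n = 10^6$ are all assumptions chosen for the example, not part of the argument.

```python
import math


def exp_series(x, terms=60):
    """Partial sum of x**m / m! for m = 0, 1, ..., terms - 1."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= x / (m + 1)  # turns x**m / m! into x**(m+1) / (m+1)!
    return total


def exp_limit(x, n=10 ** 6):
    """The nth term (1 + x/n)**n of the limit on the left-hand side."""
    return (1 + x / n) ** n


x, y = 0.5, 1.5
print(exp_series(x + y), exp_series(x) * exp_series(y), exp_limit(x + y))
```

The first two printed numbers agree to many decimal places, while the third approaches them much more slowly as $n$ grows.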

For a sketch (and the reader is invited to fill in the gaps) of this important fact we need to use the binomial theorem (see Appendix 1) to write

$$\sum_{n=0}^{\infty} \frac{(x+y)^n}{n!} = \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$$
The curve that you obtain when you vary $x$ in $e^x$ is the graph of the exponential function (Figure 7.1).

This crops up time and time again in science, engineering, economics and many other areas where mathematics is applied to understand the real world. Indeed the expression for continuously compounded interest at time $t$ that motivated the work of this chapter is given by $Pe^{rt}$. We could spend a lot of time on the function $e^x$ but for this book it’s enough to consider three things.

(i) If $x > 0$ then $e^x$ is the limit (and hence the supremum) of a bounded monotonic increasing sequence. It follows that for all natural numbers $n$,

$$e^x \ge 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}.$$

In particular, we’ll often need the case where $n = 1$:

$$e^x \ge 1 + x. \tag{7.1.6}$$

(ii) If $1 < x < y$ then $\frac{x^n}{n!} < \frac{y^n}{n!}$ for all $n$ and it follows that $e^x < e^y$.

(iii) Sometimes we want to consider $e^x$ when $x$ is itself quite a complicated expression. In that case we will use the notation $\exp\{x\}$ in place of $e^x$ so that we don’t have to ask the reader to strain their eyes too much. Here ‘exp’ stands for exponential.




It’s time for an interesting diversion. In Chapter 6, we looked at the finite series $\sum_{i=1}^{n} i = \frac{1}{2}n(n+1)$. If you have taken a first year undergraduate course in mathematics you might well have encountered the formulae (which are usually proved using the technique of mathematical induction (see Appendix 3)):

$$\sum_{i=1}^{n} i^2 = \frac{1}{6}n(n+1)(2n+1) \quad \text{and} \quad \sum_{i=1}^{n} i^3 = \frac{1}{4}n^2(n+1)^2.$$

There doesn’t appear to be a pattern here but there is and it was found by Jacob 
Bernoulli. He showed that for any natural number m: 


$$\sum_{i=1}^{n-1} i^m = \sum_{k=0}^{m} \frac{m!}{(m+1-k)!\,k!} B_k n^{m+1-k}.$$


You can sum up to n on the left-hand side if you like, but that means you have 
to replace n by n + 1 on the right-hand side which makes it more complicated. 
The numbers B 0 , B 1 , . . . , B m which appear in this formula are called Bernoulli 
numbers in honour of their discoverer. The remarkable thing is that they involve 
e x , indeed the best way to define Bernoulli numbers is indirectly through the 
identity: 




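$$\frac{x}{e^x - 1} = \sum_{n=0}^{\infty} B_n \frac{x^n}{n!},$$

so by (7.1.2)

$$x = \left(\sum_{n=1}^{\infty} \frac{x^n}{n!}\right)\left(\sum_{n=0}^{\infty} B_n \frac{x^n}{n!}\right).$$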

You can equate coefficients in this last identity to calculate the Bernoulli numbers, so if you compare coefficients of $x$ on the left and the right you get $1 = B_0$. Similarly comparing coefficients of $x^2$ you find that $0 = \frac{1}{2} + B_1$ so $B_1 = -\frac{1}{2}$. Continuing in this way you can find $B_2 = \frac{1}{6}$, $B_3 = 0$, $B_4 = -\frac{1}{30}, \ldots$, 3 and hence calculate $\sum_{i=1}^{n-1} i^m$ for as high a value of $m$ as you like. Incidentally it turns out that

$B_{2k+1}$ is always equal to zero for $k \ge 1$ and $B_{4k}$ is always negative.
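
The coefficient-matching procedure can be automated with exact fractions. The sketch below is one possible way to do it, not a standard library routine; the function names are assumptions. It uses the fact that the coefficient of $x^{n+1}$ in the product $\left(\sum_{j \ge 1} x^j/j!\right)\left(\sum_{k \ge 0} B_k x^k/k!\right)$ must vanish for $n \ge 1$, and then checks Bernoulli's formula for sums of powers against a direct sum.

```python
from fractions import Fraction
from math import factorial


def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] as exact fractions (with B_1 = -1/2)."""
    b = [Fraction(1)]
    for n in range(1, count):
        # sum_{k=0}^{n} B_k / (k! (n+1-k)!) = 0, solved for B_n.
        partial = sum(b[k] / (factorial(k) * factorial(n + 1 - k)) for k in range(n))
        b.append(-factorial(n) * partial)
    return b


def power_sum(m, n):
    """Bernoulli's formula for 1**m + 2**m + ... + (n-1)**m, with m >= 1."""
    b = bernoulli_numbers(m + 1)
    return sum(
        Fraction(factorial(m), factorial(m + 1 - k) * factorial(k)) * b[k] * n ** (m + 1 - k)
        for k in range(m + 1)
    )


print(bernoulli_numbers(9))
print(power_sum(3, 10), sum(i ** 3 for i in range(1, 10)))
```

Working with `Fraction` rather than floating-point numbers keeps every Bernoulli number exact, so the formula and the direct sum can be compared for equality.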

In Chapter 6, we mentioned Euler’s remarkable result that $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6} = B_2 \pi^2$. He also proved the more general result

$$\sum_{n=1}^{\infty} \frac{1}{n^{2k}} = \frac{(-1)^{k-1}(2\pi)^{2k}}{2\,(2k)!} B_{2k}.$$


3 See http://en.wikipedia.org/wiki/Bernoulli_number
This should be enough to convince you that Bernoulli numbers are rather
useful tools to have at your disposal. 

Let’s return to the number e itself. It must be either rational or irrational. The 
answer is in the following theorem: 

Theorem 7.1.3. e is an irrational number. 


Proof. We’ll give a proof by contradiction, so we’ll begin by assuming that $e$ is a rational number and so it can be written as $\frac{p}{q}$ where $p$ and $q$ are natural numbers.

Now for each $1 \le k \le n$, $\frac{n!}{k!}$ is a natural number, indeed you should check that $\frac{n!}{k!} = n(n-1)(n-2)\cdots(k+1)$. Also if $n > q$, we can certainly write $n! = mq$ for some natural number $m$. It follows that $n!e = n!\frac{p}{q} = mp$ is also a natural number. From now on we choose $n > q$. From what we’ve just written, we know that $N$ is an integer where

$$N = n!e - \sum_{k=0}^{n} \frac{n!}{k!}.$$

In fact $N$ is a natural number since

$$N = n!\left(e - \sum_{k=0}^{n} \frac{1}{k!}\right)$$

and we know that $e > \sum_{k=0}^{n} \frac{1}{k!}$. Now using Theorem 7.1.2 and (6.7.14) we get

$$\begin{aligned} N &= n!\left(\sum_{k=0}^{\infty} \frac{1}{k!} - \sum_{k=0}^{n} \frac{1}{k!}\right) = n!\sum_{k=n+1}^{\infty} \frac{1}{k!} = \sum_{k=n+1}^{\infty} \frac{n!}{k!} \\ &= \frac{1}{n+1} + \frac{1}{(n+1)(n+2)} + \cdots = \sum_{k=1}^{\infty} \frac{1}{(n+1)(n+2)\cdots(n+k)}. \end{aligned}$$

Now since $(n+1)(n+2)\cdots(n+k) > (n+1)^k$ for $k > 1$, we have

$$N < \sum_{k=1}^{\infty} \frac{1}{(n+1)^k}.$$




The series on the right-hand side is a geometric one with first term and common ratio both equal to $\frac{1}{n+1}$. So it converges to $\frac{\frac{1}{n+1}}{1 - \frac{1}{n+1}} = \frac{1}{n}$, and so we conclude that $N < \frac{1}{n}$. This gives us our required contradiction. □
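
The key estimate in the proof, that the tail $\sum_{k \ge 1} \frac{1}{(n+1)(n+2)\cdots(n+k)}$ is positive but smaller than $\frac{1}{n}$, can be checked with exact arithmetic for small $n$. This is only a sanity check on finitely many terms, not a replacement for the argument; the function name `tail_partial_sum` and the 30-term cut-off are assumptions for the example.

```python
from fractions import Fraction


def tail_partial_sum(n, terms):
    """Sum of 1/((n+1)(n+2)...(n+k)) for k = 1, ..., terms, as an exact fraction."""
    total = Fraction(0)
    product = 1
    for k in range(1, terms + 1):
        product *= n + k  # product is now (n+1)(n+2)...(n+k)
        total += Fraction(1, product)
    return total


for n in range(1, 6):
    partial = tail_partial_sum(n, 30)
    print(n, float(partial), 1 / n)
```

For every $n$ tried the partial sums stay strictly between 0 and $\frac{1}{n}$, so they can never be whole numbers, which is exactly the clash the proof exploits.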

Now we know that e is irrational, is it perhaps one of those numbers that 
we have encountered previously such as the square root of a prime number? To 
answer this question we need a new definition. 

Let $x$ be a real number. We say that it is algebraic if there exists a natural number $n$ and integers, 4 $a_0, a_1, \ldots, a_n$ such that

$$a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = 0,$$

i.e. the number $x$ is a solution of an equation with integer coefficients. What kinds of numbers are algebraic? Well to start with all rational numbers are, since if such a number can be written in the form $\frac{p}{q}$ then we simply take $n = 1$, $a_0 = -p$ and $a_1 = q$ to see that $-p + qx = 0$. If $p$ is any prime number (or indeed any natural number), then $\sqrt{p}$ is also algebraic as we see by taking $n = 2$, $a_0 = -p$, $a_1 = 0$ and $a_2 = 1$. Are there any irrational numbers that fail to be algebraic? The
answer to this question is yes, but unfortunately the details required to prove 
it are too complex for a book of this nature. Numbers that fail to be algebraic 
are called transcendental. The first examples of such numbers were found by 
the French mathematician Joseph Liouville (1809-82) in 1844. In particular he proved that $\sum_{n=1}^{\infty} \frac{1}{10^{n!}} = 0.110001\ldots$ is transcendental. Another mathematician, Charles Hermite (1822-1901), proved that $e$ is transcendental in 1873. His proof ran to more than thirty pages. Another well-known number that turns out to be transcendental is $\pi$ and we will turn to this next.


7.2 The Number π


In this and the next section we’ll depart from the main philosophy of this book which has been to avoid using calculus. For $\pi$ and $\gamma$ it’s impossible to probe them with any degree of depth without needing to use more sophisticated tools and these sections should be treated by readers as an invitation to the feast rather than a course (not even a starter).

It has been known since antiquity that the ratio of the circumference to the 
diameter of any (idealised) circle is constant. By idealised here I mean that the 
circle cannot be drawn by any physical instrument as its boundary has zero 
thickness. This constant has been denoted by the Greek letter $\pi$ since 1706

4 In fact it’s equivalent to take $a_0, a_1, \ldots, a_n$ to be rational. Why?

Figure 7.2. Approximation of the area of a quadrant.


when the symbol was introduced by William Jones (1675-1749). By around 2000 BC, the Babylonians used $3\frac{1}{8}$ as an approximation to $\pi$ and the Egyptians were using $\pi \approx 4\left(\frac{8}{9}\right)^2$. There are many ways in which $\pi$ can be obtained as a limit of a sequence or series. For example it is thought that Japanese mathematicians in the middle ages used arguments such as the following to estimate $\pi$.

Figure 7.2 shows a quadrant of a circle of radius 1 which is almost totally filled by $n$ non-overlapping rectangles of equal width. The total area of the quadrant is $\frac{\pi}{4}$. Now let’s calculate the area of the rectangles which approximates this. Each rectangular strip has width $\frac{1}{n}$ and by Pythagoras’ theorem, the height of the $j$th strip is $\sqrt{1 - \frac{j^2}{n^2}}$. So we see that the area of the $j$th strip is

$$\frac{1}{n}\sqrt{1 - \frac{j^2}{n^2}} = \frac{1}{n^2}\sqrt{n^2 - j^2}.$$

Consequently we deduce that the total area of all the rectangles filling the quadrant is $\frac{1}{n^2}\sum_{j=1}^{n} \sqrt{n^2 - j^2}$. Now as $n$ becomes larger and larger we can see that the rectangles fill up more and more of the quadrant and so it’s reasonable to assert that

$$\frac{\pi}{4} = \lim_{n \to \infty} \frac{1}{n^2}\sum_{j=1}^{n} \sqrt{n^2 - j^2},$$

and so $\pi = 4\lim_{n \to \infty} \frac{1}{n^2}\sum_{j=1}^{n} \sqrt{n^2 - j^2}$. The reader should experiment with calculating approximations to $\pi$ using this formula, e.g. what do you get when $n = 10, 50, 100$?
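
One way to carry out this experiment is sketched below; the function name `quadrant_estimate` is an assumption for the example. Because the rectangles sit inside the quadrant, each estimate should come out a little below $\pi$.

```python
import math


def quadrant_estimate(n):
    """4/n**2 times the sum of sqrt(n**2 - j**2) for j = 1, ..., n."""
    return 4 * sum(math.sqrt(n * n - j * j) for j in range(1, n + 1)) / (n * n)


for n in (10, 50, 100, 10000):
    estimate = quadrant_estimate(n)
    print(n, estimate, math.pi - estimate)
```

The shortfall shrinks roughly in proportion to $\frac{1}{n}$, so this method needs a great many strips to deliver even a few correct decimal places.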

The invention of calculus enabled mathematicians to find more satisfying and direct expressions for $\pi$ involving convergent series. We’ll sketch a very famous argument but beware that there are lots of gaps that need filling - even for those who know how to do the calculus.

We begin with a well-known formula of integration:

$$\arctan(x) = \int_0^x \frac{1}{1 + y^2}\,dy,$$

where $\arctan(x)$ denotes the angle in radians whose tan is $x$. By the binomial series (see Appendix 1) we have

$$\frac{1}{1 + y^2} = \sum_{n=0}^{\infty} (-1)^n y^{2n},$$

provided $|y| < 1$. Integrating term-by-term we get Gregory’s series

$$\arctan(x) = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots.$$

Now if we pass to the limit as $x \to 1$ we finally find that since $\tan\left(\frac{\pi}{4}\right) = 1$ then

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots,$$

i.e.

$$\pi = 4\left(1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots\right).$$

This formula is very beautiful but from the practical point of view of trying to calculate $\pi$ it has a serious defect. It converges very slowly as you can find for yourself by calculating some partial sums. Now Gregory’s series for $\pi$ was first published in 1671. A series that converges much faster was obtained by Isaac Newton (and published in 1742). He followed a similar path to Gregory but instead of the integral for the inverse tangent, he used that for the inverse sine:

$$\arcsin(x) = \int_0^x \frac{1}{\sqrt{1 - y^2}}\,dy.$$

Now expanding $(1 - y^2)^{-\frac{1}{2}}$ as a binomial series and integrating term-by-term yields (for $-1 < x < 1$)

$$\arcsin(x) = x + \frac{1}{2}\frac{x^3}{3} + \frac{1 \cdot 3}{2 \cdot 4}\frac{x^5}{5} + \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6}\frac{x^7}{7} + \cdots$$

and using the fact that $\arcsin\left(\frac{1}{2}\right) = \frac{\pi}{6}$ we get

$$\pi = 6\left(\frac{1}{2} + \frac{1}{2 \cdot 3}\left(\frac{1}{2}\right)^3 + \frac{1 \cdot 3}{2 \cdot 4 \cdot 5}\left(\frac{1}{2}\right)^5 + \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6 \cdot 7}\left(\frac{1}{2}\right)^7 + \cdots\right)$$

and you should compute the first few terms of this and convince yourself that it really does appear to converge faster to $\pi$ than Gregory’s series does.
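
The comparison suggested here takes only a few lines of code. In this sketch the function names `gregory_pi` and `newton_pi` are assumptions for the example; each returns the approximation to $\pi$ obtained from the stated number of terms of the corresponding series.

```python
import math


def gregory_pi(terms):
    """4 * (1 - 1/3 + 1/5 - 1/7 + ...) using the given number of terms."""
    return 4 * sum((-1) ** k / (2 * k + 1) for k in range(terms))


def newton_pi(terms):
    """6 * arcsin(1/2), using the given number of terms of the arcsin series."""
    total = 0.0
    coeff = 1.0  # runs through 1, 1/2, (1*3)/(2*4), (1*3*5)/(2*4*6), ...
    for k in range(terms):
        total += coeff * 0.5 ** (2 * k + 1) / (2 * k + 1)
        coeff *= (2 * k + 1) / (2 * k + 2)
    return 6 * total


for terms in (1, 2, 5, 10):
    print(terms, gregory_pi(terms), newton_pi(terms), math.pi)
```

Each extra term of Newton's series brings in a further factor of $\left(\frac{1}{2}\right)^2$, whereas the terms of Gregory's series at $x = 1$ shrink only like $\frac{1}{2k+1}$, which accounts for the difference in speed.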

We have shown that $e$ is irrational. Now we’ll look at the corresponding result for $\pi$. As the proof is much harder I’m only going to sketch an outline. In fact the proof is slightly indirect in that we’ll aim to show that $\pi^2$ is irrational. The irrationality of $\pi$ itself then follows from I(iii) in Chapter 2.

Theorem 7.2.1. $\pi^2$ is an irrational number.

Proof. We proceed to give a proof by contradiction so let’s assume that $\pi^2 = \frac{p}{q}$ where $p$ and $q$ are natural numbers. We’ll split the proof into steps and start at what seems to be a highly irrelevant observation.

Step 1: A Curious Sideline. Fix a natural number $n$ (which later on we will want to make large) and define the function

$$f(x) = \frac{x^n(1-x)^n}{n!} \tag{7.2.7}$$

for all $0 \le x \le 1$. It’s worth making a note of the inequality

$$f(x) \le \frac{1}{n!} \tag{7.2.8}$$

for all $0 \le x \le 1$.

We denote the first and second derivatives of $f$ by $f'$ and $f''$ and the $k$th derivative by $f^{(k)}$. Using the binomial theorem we have $(1 - x)^n = \sum_{r=0}^{n} \binom{n}{r}(-x)^{n-r}$ where the binomial coefficients $\binom{n}{r} = \frac{n!}{(n-r)!\,r!}$ (see Appendix 1). It follows that

$$f(x) = \frac{(-1)^n}{n!}\sum_{r=0}^{n} (-1)^r \binom{n}{r} x^{2n-r}$$

and if we substitute $s = n - r$ we obtain

$$f(x) = \frac{1}{n!}\sum_{s=0}^{n} (-1)^s \binom{n}{n-s} x^{n+s} = \frac{1}{n!}\sum_{s=0}^{n} c_s x^{n+s} \tag{7.2.9}$$

where $c_s = (-1)^s\binom{n}{n-s}$.
Now from (7.2.7) we have $f(x) = f(1 - x)$. From this it follows by repeated differentiation that $f^{(k)}(x) = (-1)^k f^{(k)}(1 - x)$. So in particular $f^{(k)}(1) = (-1)^k f^{(k)}(0)$. Now if $k < n$ or $k > 2n$, $f^{(k)}(0) = 0$. While if $n \le k \le 2n$ we see from (7.2.9) that

$$f^{(k)}(x) = \frac{1}{n!}\sum_{s=k-n}^{n} c_s (n+s)(n+s-1)\cdots(n+s-k+1)x^{n+s-k}$$

and so

$$f^{(k)}(0) = \frac{c_{k-n}}{n!}k! = (-1)^{k-n}\binom{n}{2n-k}k(k-1)\cdots(n+1)$$

which is an integer. So we conclude that $f^{(k)}(0)$ (and hence $f^{(k)}(1)$) is an integer for all nonnegative integer values of $k$.
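
The conclusion of Step 1 can be tested on a computer for small $n$ by treating $f$ as a list of exact polynomial coefficients. The sketch below is an illustration only (all function names are assumptions for this example): it builds the coefficients from (7.2.9), differentiates repeatedly, and evaluates each derivative at 0 and at 1.

```python
from fractions import Fraction
from math import comb, factorial


def f_coefficients(n):
    """Coefficients of f(x) = x**n (1 - x)**n / n!; index i holds the x**i term."""
    coeffs = [Fraction(0)] * (2 * n + 1)
    for s in range(n + 1):
        coeffs[n + s] = Fraction((-1) ** s * comb(n, n - s), factorial(n))
    return coeffs


def derivative(coeffs):
    """Coefficients of the derivative of the polynomial with these coefficients."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Value of the polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def derivatives_at_endpoints(n):
    """Yield (k, f^(k)(0), f^(k)(1)) for k = 0, 1, ..., 2n."""
    coeffs = f_coefficients(n)
    for k in range(2 * n + 1):
        yield k, evaluate(coeffs, 0), evaluate(coeffs, 1)
        coeffs = derivative(coeffs)


for k, at_zero, at_one in derivatives_at_endpoints(3):
    print(k, at_zero, at_one)
```

Every printed value is a whole number, the values at 0 vanish for $k < n$, and the values at 1 differ from those at 0 only by the sign $(-1)^k$, as the symmetry $f(x) = f(1 - x)$ predicts.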

Step 2: Curiouser and Curiouser

Remember that $q$ appears as the denominator of our conjectured rational number representation of $\pi^2$ and also don’t forget our function $f$ from (7.2.7). Now introduce a new function

$$g(x) = q^n\left(\pi^{2n} f(x) - \pi^{2n-2} f^{(2)}(x) + \pi^{2n-4} f^{(4)}(x) - \cdots + (-1)^n f^{(2n)}(x)\right). \tag{7.2.10}$$

If we use the facts that $f^{(k)}(0)$ and $f^{(k)}(1)$ are integers and $\pi^{2n} = \frac{p^n}{q^n}$ you can deduce that $g(0)$ and $g(1)$ are also both integers.

Differentiate (7.2.10) twice and rearrange to find that

$$g''(x) = -\pi^2 g(x) + q^n \pi^{2n+2} f(x). \tag{7.2.11}$$

Step 3: The Coup de Grace

We introduce yet a third mysterious function:

$$h(x) = g'(x)\sin(\pi x) - \pi g(x)\cos(\pi x).$$

Differentiation using the product rule and substitution from (7.2.11) yields

$$h'(x) = \pi^2 p^n f(x)\sin(\pi x). \tag{7.2.12}$$

Splitting $\pi^2 = \pi \times \pi$ and integrating both sides of (7.2.12) we get

$$\begin{aligned}\pi p^n \int_0^1 f(x)\sin(\pi x)\,dx &= \frac{1}{\pi}\int_0^1 h'(x)\,dx \\ &= \frac{1}{\pi}(h(1) - h(0)) \\ &= \frac{1}{\pi}[g'(1)\sin(\pi) - \pi g(1)\cos(\pi) - g'(0)\sin(0) + \pi g(0)\cos(0)] \\ &= g(1) + g(0),\end{aligned}$$
which is an integer. So we’ve deduced that $m(n) = \pi p^n \int_0^1 f(x)\sin(\pi x)\,dx$ is an integer for all natural numbers $n$. Furthermore, since both $f(x) > 0$ and $\sin(\pi x) > 0$ for $0 < x < 1$, it follows that $m$ is in fact a natural number. Now from the series expansion for $e^p$ we know that $\sum_{n=0}^{\infty} \frac{p^n}{n!}$ converges and so it follows from Theorem 6.4.1 that $\lim_{n \to \infty} \frac{p^n}{n!} = 0$. Hence by the definition of the limit we can find $n$ sufficiently large so that $\frac{\pi p^n}{n!} < 1$. Using this fact together with the inequality (7.2.8) and the fact that $\sin(\pi x) \le 1$ for $0 \le x \le 1$, we get for our (sufficiently large) value of $n$ that

$$m(n) = \pi p^n \int_0^1 f(x)\sin(\pi x)\,dx \le \pi p^n \int_0^1 \frac{1}{n!}\,dx = \frac{\pi p^n}{n!} < 1$$

and that gives the required contradiction. □

The fact that π is transcendental was proved by Ferdinand von Lindemann
(1852-1939) 5 in 1882. 6 As we’ve mentioned transcendental numbers again, it’s
worth making a diversion to say something more about these. At the International
Congress of Mathematicians that took place in Paris in 1900 the leading German
mathematician David Hilbert (1862-1943) presented ten key problems for
research in the twentieth century. He later published an article which included
a further thirteen and the entire collection of twenty three have since become
known as ‘Hilbert problems’. 7 For his seventh problem, Hilbert asked if $\alpha^{\beta}$ is
always transcendental if α is algebraic (but not equal to 0 or 1) and β is both
algebraic and irrational. This was proved to be correct in 1934 by A.O. Gelfond
(1906-68). 8 So for example all numbers of the form $p^{\sqrt{N}}$ are transcendental where
p is prime and N is a natural number that is not a perfect square. A corollary of
Gelfond’s result (using complex numbers - see Chapter 8, section 3) is that $e^{\pi}$ is
also transcendental. However the status of the numbers $\pi^e$, $e^e$ or $\pi^{\pi}$ (i.e. whether
they are algebraic or transcendental) remains an open problem.


7.3 The Number γ


The number γ is less well known than its famous cousins π and e but that doesn’t
make it any the less interesting. In this short section we’ll briefly indicate where it


5 See http://en.wikipedia.org/wiki/Ferdinand_von_Lindemann 

6 For those intrepid readers who want to see proofs that both e and π are transcendental,
see e.g. pages 214-5 and 217-8 (respectively) of J.K. Truss Foundations of Mathematical
Analysis, Oxford University Press (1997) - but be aware that this is a graduate level text.

7 See http://en.wikipedia.org/wiki/Hilbert's_problems 

8 See http://en.wikipedia.org/wiki/Aleksandr_Gelfond. The result was also proved 
independently in 1935 by Theodor Schneider (1911-88) (http://en.wikipedia.org/wiki/ 
Theodor_Schneider) in his PhD thesis. 



comes from. We’ll start with the divergent series $\sum_{n=1}^{\infty}\frac{1}{n}$. Integrals are continuous
versions of sums so a close relative to this series is the divergent integral $\int_1^{\infty}\frac{1}{x}\,dx$.
Leonhard Euler had the brilliant idea of investigating the sequence $(b_N)$ whose
Nth term is $\sum_{n=1}^{N}\frac{1}{n} - \int_1^{N}\frac{1}{x}\,dx$. But $\int_1^{N}\frac{1}{x}\,dx = \log(N)$ so

$$b_N = \sum_{n=1}^{N}\frac{1}{n} - \log(N).$$
Theorem 7.3.1. The sequence $(b_N)$ converges.

Proof. Since

$$\log(N) = \int_1^N \frac{1}{x}\,dx = \sum_{n=1}^{N-1}\int_n^{n+1}\frac{1}{x}\,dx$$

and

$$\sum_{n=1}^{N}\frac{1}{n} = \sum_{n=1}^{N-1}\frac{1}{n} + \frac{1}{N} = \sum_{n=1}^{N-1}\int_n^{n+1}\frac{1}{n}\,dx + \frac{1}{N},$$

we have

$$b_N = \sum_{n=1}^{N-1}\int_n^{n+1}\left(\frac{1}{n} - \frac{1}{x}\right)dx + \frac{1}{N}.$$

Now whenever $n \le x \le n+1$,

$$0 \le \frac{1}{n} - \frac{1}{x} = \frac{x-n}{nx} \le \frac{1}{n^2}, \qquad \ldots (i)$$

and so we have

$$0 < b_N = \sum_{n=1}^{N-1} a_n + \frac{1}{N},$$

where $a_n = \int_n^{n+1}\left(\frac{1}{n} - \frac{1}{x}\right)dx$.

It follows from (i) that $\sum_{n=1}^{N-1} a_n \le \sum_{n=1}^{N-1}\frac{1}{n^2}$. By the comparison test $\sum_{n=1}^{\infty} a_n$ converges
and since $\lim_{N\to\infty}\frac{1}{N} = 0$, we deduce by algebra of limits that the sequence $(b_N)$
converges. □

We write $\gamma = \lim_{N\to\infty} b_N$. The decimal expansion of γ begins 0.577215664 . . .
and it is sometimes called the Euler-Mascheroni constant. 9 It is an unsolved
problem to determine whether γ is rational or irrational.
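To get a feel for how the sequence behaves, here is a short Python sketch that evaluates $b_N = \sum_{n=1}^{N} \frac{1}{n} - \log(N)$ for a few values of N. The function name `b` simply mirrors the notation above, and the results are floating-point approximations rather than exact values.

```python
import math

def b(N: int) -> float:
    """b_N = 1 + 1/2 + ... + 1/N - log(N), using the natural logarithm."""
    return math.fsum(1.0 / n for n in range(1, N + 1)) - math.log(N)

for N in (10, 1000, 100000):
    print(N, b(N))
```

The printed values decrease slowly towards 0.5772..., which illustrates the convergence but is of course no substitute for the proof.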

γ is a truly amazing number and to find out why, I recommend the superb
book by Julian Havil ‘Gamma: Exploring Euler’s Constant’ (see also the Further
Reading section). Here you will learn about the fascinating relationship between
the number γ and the Gamma function that is defined by $\Gamma(y) = \int_0^{\infty} x^{y-1}e^{-x}\,dx$.
Using integration by parts, it’s not difficult to show that Γ(n + 1) = n! for all
natural numbers n, so Γ can be thought of as a generalisation of the factorial
to more general real numbers. It turns out that Γ is differentiable and if Γ'(y)
denotes the value of its derivative at the point y then

$$\gamma = -\Gamma'(1),$$

which is truly remarkable!
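Both claims about the Gamma function can be checked numerically. The sketch below uses `math.gamma` from the Python standard library; the helper `gamma_derivative_at_one` is a name I have made up, and it only estimates the derivative by a central difference with an arbitrarily chosen step h.

```python
import math

# Gamma(n + 1) should agree with n! for natural numbers n.
for n in range(1, 8):
    assert math.isclose(math.gamma(n + 1), math.factorial(n))

def gamma_derivative_at_one(h: float = 1e-5) -> float:
    """Central-difference estimate of the derivative of Gamma at the point 1."""
    return (math.gamma(1 + h) - math.gamma(1 - h)) / (2 * h)

# Minus the derivative at 1 should be close to Euler's constant 0.5772...
print(-gamma_derivative_at_one())
```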


9 The Italian mathematician Lorenzo Mascheroni (1750-1800) accurately calculated the
first nineteen digits in the decimal expansion of γ.

Infinite Products


However the man got a little bit excited; he wanted to prove
himself some more. “Multiplicação!” he said.

Surely You’re Joking, Mr Feynman, R.P. Feynman 


8.1 Convergence of Infinite Products 


Just as we considered infinite sums or series of numbers, we can also deal with
infinite products of numbers as limits. To be precise, suppose that we are given
a sequence of real numbers $(a_n)$. We can consider the associated sequence of
partial products $(p_n)$ where $p_n = a_1 a_2 \cdots a_n$ and enquire whether this sequence
converges to a limit. Just as we used the sigma notation for sums (the Greek letter
Σ for S), we use the Greek letter Π for P when we deal with products. So we write

$$p_n = \prod_{i=1}^{n} a_i$$

and if $p = \lim_{n\to\infty} p_n$ exists then we write

$$p = \prod_{i=1}^{\infty} a_i,$$

so that $\prod_{i=1}^{\infty} a_i = \lim_{n\to\infty}\prod_{i=1}^{n} a_i$. Of course for the limit to exist we require as usual
that for any ε > 0 there exists a natural number N so that $|p_n - p| < \epsilon$ whenever
n > N.

Here are some rather easy assertions that you may want to try to prove for
yourself:

(a) If $a_n = 0$ for some value of n then $p_m = 0$ for all $m \ge n$ and p = 0.

(b) If $a_n = c$ for all n, where c is a non-zero constant, then $(p_n)$ diverges if $c \le -1$
or c > 1, converges to 1 when c = 1 and to 0 when |c| < 1.

(c) If $0 < a_n < 1$ and $(a_n)$ is monotonic decreasing then $(p_n)$ converges to 0.
(Hint: Note that $p_n \le a_1^n$.)

So for example as a consequence of (c), you can easily deduce that $\prod_{n=1}^{\infty}\frac{1}{n+1}$
converges to 0 and hence so does $\prod_{n=1}^{\infty}\frac{1}{n} = 1 \times \prod_{n=1}^{\infty}\frac{1}{n+1}$.
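If you would like to experiment before attempting proofs, the following Python sketch lists partial products for a few of the sequences mentioned in (b) and (c). The helper name `partial_products` is my own choice, and a handful of terms can only suggest the limiting behaviour, not establish it.

```python
from itertools import accumulate
from operator import mul

def partial_products(terms):
    """Return the list of partial products p_1, p_2, ... of the given terms."""
    return list(accumulate(terms, mul))

# (b) constant sequences a_n = c
print(partial_products([0.5] * 6))    # shrinks towards 0
print(partial_products([-1] * 6))     # keeps jumping between -1 and 1
# (c) a_n = 1/(n + 1): monotonic decreasing with 0 < a_n < 1
print(partial_products([1 / (n + 1) for n in range(1, 8)]))
```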

The mathematics literature contains far less about infinite products than it 
does about infinite sums. This may be because (to a large extent) the study of 
convergence of products can be reduced to that of sums. For example if each 
a n > 0 and you know about logarithms, 1 it can be shown that 


$$\log\left(\prod_{i=1}^{\infty} a_i\right) = \sum_{i=1}^{\infty}\log(a_i), \qquad (8.1.1)$$

and so

$$\prod_{i=1}^{\infty} a_i = \exp\left(\sum_{i=1}^{\infty}\log(a_i)\right).$$

I’m not really assuming much knowledge of logarithms in this book and we 
won’t use them again in this chapter. The next inequality gives a nice link between 
convergence of certain infinite series and that of related infinite products. However 
the proof uses some facts about continuous functions that aren’t dealt with in this 
book. Rest assured that you can skip the proof if you want to. We won’t use the 
result anywhere else in the book. 


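Theorem 8.1.1. If $(a_n)$ is such that each $a_n = 1 + \alpha_n$ where $\alpha_n > 0$, then $\sum_{i=1}^{\infty}\alpha_i$
converges if and only if $\prod_{i=1}^{\infty} a_i$ converges and in either case we have

$$\sum_{i=1}^{\infty}\alpha_i \le \prod_{i=1}^{\infty} a_i \le \exp\left(\sum_{i=1}^{\infty}\alpha_i\right).$$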

Proof. We’ll begin by proving the inequality for finite sums and products, i.e.
when ∞ is replaced by n throughout. First consider the left-hand inequality. It is


1 Here I’m using the notation ‘log’ to mean ‘$\log_e$’ i.e. the logarithm to base e, so that
$e^{\log(x)} = \log(e^x)$ for all x > 0 (both sides being equal to x). Note that some textbooks denote log by ln.


fairly easy to verify this as

$$\prod_{i=1}^{n}(1+\alpha_i) = (1+\alpha_1)(1+\alpha_2)\cdots(1+\alpha_n) \ge \sum_{i=1}^{n}\alpha_i,$$

as $\sum_{i=1}^{n}\alpha_i$ is just one of the terms that we get when we expand out the brackets on
the right-hand side.

For the right-hand inequality, by (7.1.6) $e^{\alpha_i} \ge 1 + \alpha_i$ for each i and so

$$\prod_{i=1}^{n}(1+\alpha_i) \le e^{\alpha_1}e^{\alpha_2}\cdots e^{\alpha_n} = \exp\left(\sum_{i=1}^{n}\alpha_i\right).$$

So we have shown that for all natural numbers n

$$\sum_{i=1}^{n}\alpha_i \le \prod_{i=1}^{n} a_i \le \exp\left(\sum_{i=1}^{n}\alpha_i\right). \qquad (8.1.2)$$

Now suppose that $\sum_{i=1}^{\infty}\alpha_i$ converges; then the sequence whose nth term is
$\exp\left(\sum_{i=1}^{n}\alpha_i\right)$ converges to $\exp\left(\sum_{i=1}^{\infty}\alpha_i\right)$. To verify this you need the fact that the
function $x \to e^x$ is continuous which enables us to legitimately conclude that

$$\lim_{n\to\infty}\exp\left(\sum_{i=1}^{n}\alpha_i\right) = \exp\left(\sum_{i=1}^{\infty}\alpha_i\right).$$

The study of continuous functions lies beyond the scope of this book - but it can
be found in any standard introductory text on analysis (see Chapter 12 for a brief
taster). It then follows from (8.1.2), 2 that for all n

$$\prod_{i=1}^{n} a_i \le \exp\left(\sum_{i=1}^{\infty}\alpha_i\right),$$

and the convergence of the infinite product may now be deduced by using
Theorem 5.2.1. The proof of the rest of the theorem is left as an exercise for
the reader. □
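The finite inequality (8.1.2) is easy to test on a computer. The sketch below tries it with the choice $\alpha_i = 1/i^2$, which is my own example and not taken from the text; the function name `check_8_1_2` is likewise just a label for this illustration.

```python
import math

def check_8_1_2(alphas):
    """Return (sum, product, exp(sum)) for the terms a_i = 1 + alpha_i."""
    s = math.fsum(alphas)
    p = math.prod(1 + a for a in alphas)
    return s, p, math.exp(s)

alphas = [1 / i**2 for i in range(1, 2001)]
s, p, e = check_8_1_2(alphas)
print(s, p, e)   # the three numbers should appear in increasing order
```

Since $\sum 1/i^2$ converges, the theorem tells us that the corresponding infinite product converges as well, and the middle number gives an idea of its value.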

When we studied infinite series we had the interesting result (Theorem 6.4.1)
that if $\sum_{n=1}^{\infty} a_n$ converges then $\lim_{n\to\infty} a_n = 0$. There is an analogous result for
infinite products and here it is:


2 You also need the fact that $e^x \le e^y$ if $x \le y$.


Theorem 8.1.2. Let $(a_n)$ be a sequence with each $a_n > 0$. If $\prod_{i=1}^{\infty} a_i$ converges to
a limit l and $l \ne 0$ then $(a_n)$ converges and $\lim_{n\to\infty} a_n = 1$.


Proof. This result can be deduced very quickly from (8.1.1) and Theorem 6.4.1
or (in the case where each $a_i > 1$) from Theorems 8.1.1 and 6.4.1. We’ll leave the
details of this to the reader. Below I’ll give a ‘stand-alone’ proof that requires no
knowledge of logarithms or the constraint that $a_i > 1$.

By definition of a limit, we know that given any ε > 0 there exists a natural
number N such that if n > N then $\left|\prod_{i=1}^{n} a_i - l\right| < \epsilon$, i.e.

$$l - \epsilon < \prod_{i=1}^{n} a_i < l + \epsilon.$$

Now using (L5) we have that

$$\frac{l - \epsilon}{l + \epsilon} < a_{n+1} = \frac{\prod_{i=1}^{n+1} a_i}{\prod_{i=1}^{n} a_i} < \frac{l + \epsilon}{l - \epsilon},$$

i.e.

$$1 - \frac{2\epsilon}{l + \epsilon} < a_{n+1} < 1 + \frac{2\epsilon}{l - \epsilon}.$$

By making use of (L4) and (L5) you can check that $-\frac{2\epsilon}{l - \epsilon} < -\frac{2\epsilon}{l + \epsilon}$ and so

$$-\frac{2\epsilon}{l - \epsilon} < a_{n+1} - 1 < \frac{2\epsilon}{l - \epsilon}.$$

In other words

$$|a_{n+1} - 1| < \frac{2\epsilon}{l - \epsilon}.$$

Now remember this holds for all n > N and observe that $\frac{2\epsilon}{l - \epsilon}$ can be made
arbitrarily small. Indeed pick δ > 0 to be as small as you like. Then to get $\frac{2\epsilon}{l - \epsilon} < \delta$
is equivalent to ensuring that $\epsilon < \frac{l\delta}{2 + \delta}$ which you are at perfect liberty to do. The
result then follows. □


There is a beautiful way of obtaining π as an infinite product which was first
discovered by the English mathematician John Wallis (1616-1703) and published
in 1655 in his Arithmetica Infinitorum. 3 His formula is usually written:

$$\frac{\pi}{2} = \frac{2 \cdot 2 \cdot 4 \cdot 4 \cdot 6 \cdot 6 \cdots}{1 \cdot 3 \cdot 3 \cdot 5 \cdot 5 \cdot 7 \cdots},$$

but in the light of the work we’ve done in this chapter, we can write

$$\frac{\pi}{2} = \prod_{n=1}^{\infty}\frac{(2n)^2}{(2n-1)(2n+1)}.$$

To see why this result is true you’ll need to master the calculus.
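Even without the calculus we can watch the product at work. A standard way of writing Wallis’s product is $\frac{\pi}{2} = \prod_{n=1}^{\infty}\frac{(2n)^2}{(2n-1)(2n+1)}$, and the Python sketch below computes its partial products in floating-point arithmetic. The function name `wallis` is my own.

```python
import math

def wallis(N: int) -> float:
    """Partial product of (2n)^2 / ((2n - 1)(2n + 1)) for n = 1, ..., N."""
    p = 1.0
    for n in range(1, N + 1):
        p *= (2 * n) ** 2 / ((2 * n - 1) * (2 * n + 1))
    return p

for N in (10, 1000, 100000):
    print(N, 2 * wallis(N), math.pi)
```

The convergence is painfully slow, so this is a lovely formula but not a practical way of computing π.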


3 See e.g. http://en.wikipedia.org/wiki/John_Wallis


8.2 Infinite Products and Prime Numbers 


In this section we will aim to understand a beautiful formula which links infinite 
series of negative powers of natural numbers with infinite products involving 
prime numbers. This result was discovered by Leonhard Euler whose work we 
have already encountered several times in this book. Not only does it constitute a 
piece of wonderful mathematics in itself, but it was also the launching point for 
some marvellous work by Bernhard Riemann (1826-66) which led to what is now
called the Riemann zeta function.

Consider the infinite series $\sum_{n=1}^{\infty}\frac{1}{n^r}$ where r > 1. In honour of Riemann’s later
discovery we’ll write

$$\zeta(r) = \sum_{n=1}^{\infty}\frac{1}{n^r},$$

where ζ is the Greek letter ‘zeta’ which plays the same role as the English ‘z’ and
is pronounced ‘zeetah’.

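Euler’s great result is:


Theorem 8.2.1. For all r > 1,

$$\sum_{n=1}^{\infty}\frac{1}{n^r} = \prod_{p}\frac{1}{1 - \frac{1}{p^r}}. \qquad (8.2.3)$$

Note that here $\prod_p$ means the product over all the prime numbers so

$$\prod_{p}\frac{1}{1 - \frac{1}{p^r}} = \frac{2^r\, 3^r\, 5^r\, 7^r\, 11^r \cdots}{(2^r - 1)(3^r - 1)(5^r - 1)(7^r - 1)(11^r - 1)\cdots}.$$

We’ll give two outline proofs of Theorem 8.2.1: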


Proof 1. For each prime number p, $\frac{1}{p^r} < 1$, and so by a binomial series expansion

$$\left(1 - \frac{1}{p^r}\right)^{-1} = 1 + \frac{1}{p^r} + \frac{1}{p^{2r}} + \frac{1}{p^{3r}} + \cdots \qquad (8.2.4)$$

Now fix a natural number N and consider $\prod_{p \le N}\left(1 - \frac{1}{p^r}\right)^{-1}$. When we collect
together all the terms obtained when all the different series of the form (8.2.4) are
multiplied together, we obtain an infinite sum of all possible terms of the form
$\frac{1}{p_1^{n_1 r} p_2^{n_2 r} \cdots p_k^{n_k r}}$ where $p_1, p_2, \ldots, p_k$ are prime numbers no larger than N and $n_1, n_2, \ldots, n_k$
are natural numbers. It follows by Theorem 1.2.1 that

$$\prod_{p \le N}\left(1 - \frac{1}{p^r}\right)^{-1} = \sum \frac{1}{n^r},$$

where the sum is over all natural numbers whose prime factors are no larger than
N. Now let N → ∞ and the result follows. 4 □


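Proof 2. By definition

$$\zeta(r) = 1 + \frac{1}{2^r} + \frac{1}{3^r} + \frac{1}{4^r} + \cdots$$

Then

$$\left(1 - \frac{1}{2^r}\right)\zeta(r) = \zeta(r) - \frac{1}{2^r}\zeta(r) = 1 + \frac{1}{3^r} + \frac{1}{5^r} + \frac{1}{7^r} + \frac{1}{9^r} + \cdots$$

We next compute

$$\left(1 - \frac{1}{3^r}\right)\left(1 - \frac{1}{2^r}\right)\zeta(r) = 1 + \frac{1}{5^r} + \frac{1}{7^r} + \frac{1}{11^r} + \frac{1}{13^r} + \cdots$$

Now let’s take stock. Multiplying ζ(r) by $\left(1 - \frac{1}{2^r}\right)$ removed all the terms $\frac{1}{n^r}$ in which n is a multiple of 2.
A further multiplication by $\left(1 - \frac{1}{3^r}\right)$ removed all those in which n is a multiple of 3. If we continue
multiplying up to some large prime p then $\left(1 - \frac{1}{p^r}\right)\cdots\left(1 - \frac{1}{3^r}\right)\left(1 - \frac{1}{2^r}\right)\zeta(r) - 1$
is a sum of terms of the form $\frac{1}{n^r}$ where n has no prime factor p or smaller. Now
let p → ∞ and deduce that in the limit this sum converges to zero. The result
follows. 5 □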
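Neither outline proof is fully rigorous, but a numerical experiment is reassuring. The Python sketch below compares a partial sum of $\sum 1/n^r$ with a partial product of $\prod_p (1 - p^{-r})^{-1}$ for r = 2. All of the function names are my own, and the cut-off values are arbitrary choices.

```python
def primes_up_to(limit: int):
    """Sieve of Eratosthenes: all primes <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(range(i * i, limit + 1, i))
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def zeta_partial_sum(r, terms: int) -> float:
    """1 + 1/2^r + ... + 1/terms^r."""
    return sum(1 / n ** r for n in range(1, terms + 1))

def euler_partial_product(r, limit: int) -> float:
    """Product of 1 / (1 - p^(-r)) over all primes p <= limit."""
    product = 1.0
    for p in primes_up_to(limit):
        product *= 1 / (1 - p ** (-r))
    return product

print(zeta_partial_sum(2, 100000))
print(euler_partial_product(2, 10000))
```

The two printed numbers agree to about four decimal places, as Euler’s formula predicts.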


4 You can and should question whether this argument really deserves to be called a proof.
5 This isn’t precise enough - can you make it so?


As a direct consequence of Theorem 8.2.1 we may write

$$\zeta(r)^{-1} = \prod_{p}\left(1 - \frac{1}{p^r}\right).$$

If we expand out the product on the right-hand side (and just for once, let’s not
worry too much about rigour) we get

$$\zeta(r)^{-1} = 1 - \sum_{p}\frac{1}{p^r} + \sum_{p_1 < p_2}\frac{1}{(p_1 p_2)^r} - \sum_{p_1 < p_2 < p_3}\frac{1}{(p_1 p_2 p_3)^r} + \cdots$$

Notice that the right-hand side is a sum of reciprocals of integers, written in
terms of their prime factorisation, in which no square appears (as every prime
factor appears only once). In fact we can write this succinctly as follows:

$$\zeta(r)^{-1} = \sum_{n=1}^{\infty}\frac{\mu(n)}{n^r},$$

where μ is the Möbius function, 6 which is defined as follows:

$$\mu(n) = \begin{cases} 0 & \text{if } n \text{ fails to be square-free,} \\ 1 & \text{if } n = 1 \text{ or is the product of an even number of distinct primes,} \\ -1 & \text{if } n \text{ is the product of an odd number of distinct primes.} \end{cases}$$

The Möbius function plays an important role in number theory but we will
give no more than this brief introduction to it here.
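The Möbius function is simple to compute for small n, which lets us test the formula $\zeta(r)^{-1} = \sum_{n} \mu(n)/n^r$ numerically. The Python sketch below does this for r = 2, using trial division. The name `mobius` and the cut-off 5000 are my own choices, and ζ(2) is itself only approximated by a partial sum.

```python
def mobius(n: int) -> int:
    """Möbius function of a natural number n, computed by trial division."""
    if n == 1:
        return 1
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p squared divides the original n
                return 0
            result = -result    # one more distinct prime factor
        p += 1
    if n > 1:                   # a single prime factor is left over
        result = -result
    return result

terms = 5000
mobius_sum = sum(mobius(n) / n ** 2 for n in range(1, terms + 1))
zeta_2 = sum(1 / n ** 2 for n in range(1, terms + 1))
print(mobius_sum, 1 / zeta_2)
```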

The next two results use only finite (rather than infinite) products but this seems 
like an ideal opportunity to include them. First let us return to Chapter 1. Recall 
Theorem 1.2.2 where we gave the classic proof by Euclid that there are infinitely 
many primes. More than 2000 years later, Leonhard Euler gave a different and 
very elegant proof that employs infinite series. 


Theorem 8.2.2 (Theorem 1.2.2 revisited). There are an infinite number of prime 
numbers. 


Proof. We again use proof by contradiction and assume that there are only
a finite number of primes, N, say, and we write them in order as $p_1, p_2, \ldots, p_N$
(of course $p_1 = 2$ and $p_2 = 3$ but the notation is convenient). We will use the
fact (which we proved in Theorem 1.2.1) that every natural number n has a prime
factorisation $n = p_1^{m_1} p_2^{m_2} \cdots p_N^{m_N}$ (where some of the $m_i$s may be zero). Now recall
the sequence $(s_n)$ of partial sums of the harmonic series: $s_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n}$ and
for now let us fix a value of n. Let m be the largest of the exponents $m_1, m_2, \ldots, m_N$
that occur in the prime factorisations of the numbers 1, 2, . . . , n and consider the


6 Named after the German mathematician August Ferdinand Möbius (1790-1868).


product

$$\left(1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^m}\right)\left(1 + \frac{1}{3} + \frac{1}{3^2} + \cdots + \frac{1}{3^m}\right)\cdots\left(1 + \frac{1}{p_N} + \frac{1}{p_N^2} + \cdots + \frac{1}{p_N^m}\right),$$

which we can write succinctly as $\prod_{i=1}^{N}\sum_{k=0}^{m}\frac{1}{p_i^k}$. The key observation is that

$$s_n \le \prod_{i=1}^{N}\sum_{k=0}^{m}\frac{1}{p_i^k},$$

since every natural number whose reciprocal appears in the sum on the left-hand
side has a prime factorisation whose reciprocal appears on the right-hand side.
But for each $1 \le i \le N$, using the formula for the sum of a geometric series,
we have

$$\sum_{k=0}^{m}\frac{1}{p_i^k} \le \sum_{k=0}^{\infty}\frac{1}{p_i^k} = \frac{1}{1 - \frac{1}{p_i}},$$

and so we conclude that

$$s_n \le \prod_{i=1}^{N}\frac{1}{1 - \frac{1}{p_i}} = \prod_{i=1}^{N}\frac{p_i}{p_i - 1}.$$

Now the term on the right-hand side is completely independent of the choice
of n and so we conclude that the sequence $(s_n)$ is bounded above. But it is also
monotonic increasing and so it has a limit. This contradicts the known fact that
the harmonic series diverges and enables us to conclude that there cannot be a
finite number of prime numbers. □
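The key inequality can be seen in action using exact fractions. In the Python sketch below the product runs over the primes not exceeding n, which are all that is needed to factorise 1, 2, ..., n; this is a variation of my own on the argument, where the product is taken over the supposedly finite list of all primes. The function names are likewise hypothetical.

```python
from fractions import Fraction

def primes_up_to(n: int):
    """All primes p <= n, found by simple trial division."""
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def harmonic(n: int) -> Fraction:
    """s_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

def prime_bound(n: int) -> Fraction:
    """Product of p / (p - 1) over the primes p <= n."""
    bound = Fraction(1)
    for p in primes_up_to(n):
        bound *= Fraction(p, p - 1)
    return bound

for n in (5, 20, 60):
    print(n, float(harmonic(n)), float(prime_bound(n)))
```

In each row the harmonic sum stays below the product. Since the harmonic sums grow without bound, the products must do so too, and that can only happen if the supply of primes never runs out.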


To close this section, we’ll prove a result that uses both the properties of the
exponential function that were developed in the last chapter, and the notation for
finite products that we’ve seen in this one. We’ve already seen that $\sum_{n=1}^{\infty}\frac{1}{n}$ diverges,
and (Theorem 6.5.2) that it still diverges when the sum is restricted to square-free integers. Of
course every prime number is square-free and you may have been wondering
about $\sum_{p}\frac{1}{p}$ where the sum is over all prime numbers. The question was (again)
settled by Leonhard Euler. 7


7 The proof given here is due to Ivan Niven, The American Mathematical Monthly Vol. 78,
No. 2 pp. 272-3 (1971) and I found it on pages 75-6 of ‘Euler: The Master of Us All’ by William
Dunham, Math. Assoc. of America (1999).


Theorem 8.2.3. $\sum_{p}\frac{1}{p}$ diverges.


Proof. We’ll give a proof by contradiction. Suppose that $\sum_{p}\frac{1}{p}$ converges to a
real number L. Fix a natural number n and let q be the largest prime number that
doesn’t exceed n. Now since L is the supremum of the sequence of partial sums
we have (using (ii) just after (7.1.6)) that

$$\begin{aligned}
e^L &\ge \exp\left(\frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \cdots + \frac{1}{q}\right) \\
&= \prod_{2 \le p \le q} e^{1/p} \\
&\ge \prod_{2 \le p \le q}\left(1 + \frac{1}{p}\right) \\
&\ge \sum_{k \le n,\ k\ \text{square-free}}\frac{1}{k},
\end{aligned}$$

where the sum is over all square-free integers less than or equal to n. This inequality
is correct since the square-free integers are precisely those that have no squares
(or higher powers) of primes in their prime factorisation and we get the reciprocal
of all those that don’t exceed n (and also some that do) when we multiply
out the bracket. So we’ve shown that $e^L$ is an upper bound for the monotonic
increasing sequence whose nth term is $\sum_{k \le n,\ k\ \text{square-free}}\frac{1}{k}$ and so this sequence converges by
Theorem 5.2.1 (i) and that contradicts Theorem 6.5.2. Hence we deduce that $\sum_{p}\frac{1}{p}$
diverges. □


8.3 Diversion - Complex Numbers and the Riemann Hypothesis 


One of the greatest achievements of Bernhard Riemann was to extend ζ(r) so
that the real number r is replaced by a complex number. Complex numbers first
appeared in the work of Rafael Bombelli (1526-72), 8 who instead of treating
8 See e.g. http://en.wikipedia.org/wiki/Rafael_Bombelli

Figure 8.1. Complex numbers in the ‘Argand diagram’. 9


$x^2 = -1$ as an equation that couldn’t be solved, decided to pretend that it could.
As this is a quadratic equation, it must have two roots and since these cannot
be real numbers, let’s call them i and −i. So i is a new type of number that has the
property that $i^2 = -1$. In fact we can give i a nice geometric interpretation in the
dynamic spirit of Section 1.3 where we viewed negative numbers as reflections
through an imaginary mirror on the y-axis. We should again think of the real line
as sitting inside an infinite two-dimensional plane as in Figure 1.3. Recall that a
reflection is the same thing as a rotation (and for convenience, we’ll take this to
be anti-clockwise) through 180 degrees. Now let’s suppose that we instead rotate
(again anti-clockwise) through 90 degrees. We can identify this rotation with
the new number i, so i takes the point with co-ordinates (1, 0) on the x-axis to the
point with co-ordinates (0, 1) on the y-axis. If we then make another 90 degree
rotation, we will have rotated 180 degrees in total and reach the point (−1, 0).
So $i^2$ takes us from (1, 0) to (−1, 0), i.e. $i^2 = -1$.

When we think in this way, any point in the plane which has co-ordinates (a, b)
(where a and b are real numbers) is re-interpreted as a complex number a + ib
(see Figure 8.1). Such numbers can be added by the rule:

$$(a + ib) + (c + id) = (a + c) + i(b + d),$$

and also multiplied together by expanding brackets in the usual way, remembering
that $i^2 = -1$ so

$$(a + ib)(c + id) = ac + ibc + iad + i^2 bd = (ac - bd) + i(bc + ad).$$

9 The representation of complex numbers as points in two-dimensional space is often
called an Argand diagram after Jean-Robert Argand (1768-1822).

We can even discuss convergence of sequences of complex numbers, so if $(z_n)$ is
such a sequence where each $z_n = a_n + ib_n$ we say that $(z_n)$ converges whenever
both of the real-valued sequences $(a_n)$ and $(b_n)$ do. Indeed in this case, if $(a_n)$
converges to a and $(b_n)$ converges to b then $(z_n)$ converges to a + ib, e.g.
$\lim_{n\to\infty}\left\{\frac{1}{n} + i\left(1 - \frac{1}{n^2}\right)\right\} = i$. This also enables us to develop a theory of infinite
series $\sum_{n=1}^{\infty} z_n$ where each $z_n$ is a complex number. All of this is much more

than just a delightful game. Complex numbers have a rich and powerful theory 
that sometimes throws great light on facts about real numbers. 10 Indeed Jacques
Hadamard, whom we met earlier through his work on the prime number theorem,
wrote that ‘The shortest path between two truths in the real domain sometimes
passes through the complex domain’. As an example of the unifying power of
complex numbers we’ll mention the fundamental theorem of algebra which states
that any algebraic equation of the form

$$a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = 0,$$

where $a_0, a_1, a_2, \ldots, a_n$ are real numbers with $a_n \ne 0$, has n solutions (counted with
multiplicity), once complex numbers are allowed to come in and play. Complex numbers
also play a fundamental role in many applications of mathematics to science and
engineering; indeed they are a
vital tool in the quantum theory which underlies our understanding of molecules,
atoms and the microscopic realms of elementary particles.

Now let’s return to Riemann and the zeta function. He considered the function

$$\zeta(z) = \sum_{n=1}^{\infty}\frac{1}{n^z}, \qquad (8.3.5)$$

where z = x + iy. This series only converges when x > 1 but Riemann realised
that it makes sense to ‘analytically continue’ the function ζ into the complex plane
for the case where x < 1. We should be clear that ζ looks different for x < 1 than
it does for x > 1 - indeed if we naively substitute x = −1 and y = 0 into (8.3.5)
we get the divergent series $\sum_{n=1}^{\infty} n$. There is something much more subtle going on
here. When thinking of ζ for x < 1 it is better to try to imagine a completely
different function (one for which there is no clean formula) which naturally flows
into (8.3.5) as we pass the x = 1 threshold.

Riemann focussed attention on the equation

$$\zeta(z) = 0.$$

It turns out that this equation has an infinite number of solutions that lie in
the so-called critical strip, 0 < x < 1. The celebrated Riemann hypothesis is that
all of these are such that $x = \frac{1}{2}$. This problem has remained unsolved for over
150 years and may be the most important mathematical problem of all time. Its

10 Any real number x can be identified with the complex number x + i0.

solution will lead to important information about prime numbers. Let’s be a little
more precise about this. Recall the prime number theorem that was discussed at
the end of Section 4.1. It tells us that $\lim_{n\to\infty}\frac{\pi(n)\log(n)}{n} = 1$ where π(n) is the
number of prime numbers which are less than or equal to n. In fact an even better
approximation to π(n) can be found by using the logarithmic integral defined by
$\mathrm{li}(x) = \int_2^x \frac{1}{\log(y)}\,dy$ for x > 2, and we have the stronger form of the prime number
theorem: $\lim_{n\to\infty}\frac{\pi(n)}{\mathrm{li}(n)} = 1$. Now if the Riemann hypothesis is valid then we would
get very precise information about how good an approximation li(n) is to π(n). In
fact it can be shown that if the Riemann hypothesis is true then for all n ≥ 2657, 11

$$|\pi(n) - \mathrm{li}(n)| < \frac{1}{8\pi}\sqrt{n}\,\log_e(n). \qquad (8.3.6)$$

If we divide both sides of (8.3.6) by li(n) we get

$$\left|\frac{\pi(n)}{\mathrm{li}(n)} - 1\right| < \frac{1}{8\pi}\,\frac{\sqrt{n}\,\log_e(n)}{\mathrm{li}(n)}$$

and since $\lim_{n\to\infty}\frac{\sqrt{n}\,\log_e(n)}{\mathrm{li}(n)} = 0$, we see that (8.3.6) yields an ‘error estimate’ giving
information on how close an approximation li(n) is to π(n) for very large n.

At the time of writing, it has been shown that the first 100 billion zeroes
of ζ do indeed satisfy $x = \frac{1}{2}$ but of course, the conjecture may still be false.
There are a number of popular accounts of the Riemann hypothesis, see e.g.
Prime Obsession by John Derbyshire (Penguin 2004), Dr Riemann’s Zeros by Karl
Sabbagh (Atlantic Books 2002) and The Music of the Primes by Marcus du Sautoy
(Fourth Estate 2003). The solution of the Riemann hypothesis featured within the
eighth of Hilbert’s 23 problems that were mentioned at the end of Section 7.2. One
hundred years later it remained as one of seven unsolved ‘Millennium problems’ 12
that were announced by the Clay Mathematics Institute in May 2000.
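It is instructive to see how comfortably the bound $\frac{1}{8\pi}\sqrt{n}\log_e(n)$ on |π(n) − li(n)| holds for moderate values of n. The Python sketch below counts primes with a sieve and approximates the logarithmic integral by Simpson’s rule. I am assuming here that the integral starts at 2, and the function names and the number of integration steps are my own choices, so treat the output as a rough illustration only.

```python
import math

def prime_count(n: int) -> int:
    """pi(n): the number of primes <= n, using a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return sum(sieve)

def li(x: float, steps: int = 200000) -> float:
    """Simpson's rule estimate of the integral of 1/log(y) from 2 to x."""
    if steps % 2:
        steps += 1                      # Simpson's rule needs an even number of steps
    h = (x - 2) / steps
    total = 1 / math.log(2) + 1 / math.log(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) / math.log(2 + k * h)
    return total * h / 3

def error_bound(n: int) -> float:
    """The quantity sqrt(n) * log(n) / (8 * pi) from the inequality above."""
    return math.sqrt(n) * math.log(n) / (8 * math.pi)

for n in (10000, 100000):
    print(n, prime_count(n), li(n), error_bound(n))
```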


11 This result was established by the American mathematician Lowell Schoenfeld (1920-
2002) in 1976.

12 See http://www.claymath.org/millennium/

Continued Fractions


Continued fractions are part of the "lost mathematics", the mathematics now 
considered too advanced for high school and too elementary for college. 

A History of π, P. Beckmann


In this chapter we’ll take a brief look at an alternative and attractive way of
representing real numbers. First we’ll need a preliminary result that is of great 
interest in its own right. 


9.1 Euclid’s Algorithm 


Recall that the highest common factor (hcf) of two natural numbers is the largest 
natural number that divides them both and leaves no remainder. It is also called 
the greatest common divisor (gcd). We’ll write hcf(x, y) for the highest common 
factor of x and y, so e.g. hcf(3, 9) = 3, hcf(24, 108) = 12 and hcf(4, 7) = 1. 
Numbers such as 4 and 7 whose highest common factor is 1 are said to be 
coprime. Note that being coprime is a relative property of a pair of numbers 
and is nothing to do with either of the numbers being prime - e.g. 8 and 9 are 
coprime though neither is prime. Another useful bit of notation is x\y which 
means that x divides into y and leaves no remainder - so we have for example, 
4|96, 7|245 and 19|171.

Euclid gave a practical algorithm for finding hcf(x, y) and we will now describe
how this works. Suppose that x < y and divide x into y to get

$$y = cx + d,$$

where c is the unique natural number for which $cx \le y < (c + 1)x$ (from which
it follows that d < x). If d = 0 we are finished as y is then a multiple of x and so
hcf(x, y) = x. So from now on we’ll assume that $d \ne 0$. Now notice that any factor
of both x and y is also a factor of d since d = y − cx. Also any factor of both x and
d is also a factor of y. It follows that hcf(y, x) = hcf(x, d) and we might as well
replace y and x by the smaller pair of numbers x and d. From now on we’ll write
c as $c_1$ and d as $d_1$ as we are going to iterate the above process. So since $d_1 < x$ we
can divide $d_1$ into x to get

$$x = c_2 d_1 + d_2.$$

Arguing as above we have $\mathrm{hcf}(x, d_1) = \mathrm{hcf}(d_1, d_2)$ and so we can replace x and $d_1$
by $d_1$ and $d_2$. Continuing in this manner we generate a decreasing sequence of
natural numbers $d_1, d_2, d_3, \ldots$ and since these are whole numbers the sequence
must terminate at some point N, i.e.

$$d_{N-1} = c_{N+1} d_N. \qquad (i)$$

Going back one step we have

$$d_{N-2} = c_N d_{N-1} + d_N, \qquad (ii)$$

so any factor of both $d_{N-2}$ and $d_{N-1}$ is a factor of $d_N$. Arguing backwards we
thus deduce that any factor of both x and y is a factor of $d_N$. But $d_N$ is a factor of
$d_{N-1}$ by (i) and hence by (ii) is also a factor of $d_{N-2}$. Again arguing backwards we
deduce that $d_N$ is a factor of both x and y. It follows that $d_N = \mathrm{hcf}(x, y)$.

An example will help us see what’s going on. Let y = 93 and x = 36. We write

93 = (2 × 36) + 21
36 = (1 × 21) + 15
21 = (1 × 15) + 6
15 = (2 × 6) + 3
6 = 2 × 3

So hcf(93, 36) = 3. In this case we have N = 4, $c_1 = 2$, $c_2 = 1$, $c_3 = 1$, $c_4 = 2$,
$c_5 = 2$, $d_1 = 21$, $d_2 = 15$, $d_3 = 6$ and $d_4 = 3$.
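Euclid’s algorithm translates almost word for word into a short program. The Python sketch below returns both the highest common factor and the list of quotients $c_1, c_2, \ldots$ that arise along the way. The function name `euclid` and the decision to return the quotients are my own choices for this illustration.

```python
import math

def euclid(x: int, y: int):
    """Return (hcf(x, y), [c_1, c_2, ...]) for natural numbers x and y."""
    quotients = []
    while x != 0:
        c, d = divmod(y, x)     # y = c*x + d with 0 <= d < x
        quotients.append(c)
        y, x = x, d             # replace the pair (y, x) by (x, d)
    return y, quotients

print(euclid(36, 93))           # the worked example with y = 93 and x = 36
print(math.gcd(36, 93))         # the standard library agrees on the hcf
```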


9.2 Rational and Irrational Numbers as Continued Fractions 


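We’ll carry on with the example we’ve just presented and rewrite it in fractional
form:

$$\begin{aligned}
\frac{93}{36} &= 2 + \frac{21}{36} && \ldots (i) \\
\frac{36}{21} &= 1 + \frac{15}{21} && \ldots (ii) \\
\frac{21}{15} &= 1 + \frac{6}{15} && \ldots (iii) \\
\frac{15}{6} &= 2 + \frac{3}{6} && \ldots (iv) \\
\frac{6}{3} &= 2 && \ldots (v)
\end{aligned}$$

Now substitute (ii) into (i) to get

$$\frac{93}{36} = 2 + \cfrac{1}{1 + \cfrac{15}{21}}.$$

Substituting from (iii) we obtain

$$\frac{93}{36} = 2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{6}{15}}}.$$

Now (iv) allows us to change this to

$$\frac{93}{36} = 2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{3}{6}}}},$$

and finally we incorporate (v) to conclude that

$$\frac{93}{36} = 2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{2}}}}.$$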

This is the continued fraction representation of the number $\frac{93}{36}$. We are going
to investigate which other numbers we can write in this form so let’s introduce
some notation. Let $(q_n)$ be a sequence of nonnegative integers. We’ll call

$$f = q_0 + \cfrac{1}{q_1 + \cfrac{1}{q_2 + \cfrac{1}{q_3 + \cfrac{1}{q_4 + \cdots}}}} \qquad (9.2.1)$$

a regular continued fraction. We’ll say the fraction f terminates if $q_{N+1} = 0$ for
some N. In this case f is perfectly well defined. If f doesn’t terminate then it is not
clear at this stage what meaning can be given to (9.2.1) and we’ll come back to this
point. You may guess that it will have something to do with limits. (9.2.1) is a pain
to write out so we’ll need some alternative notation. As a simplified representation
of (9.2.1) we will write

$$f = [q_0, q_1, q_2, q_3, q_4, \ldots],$$

and if f terminates at N > 2 then $f = [q_0, q_1, q_2, \ldots, q_N]$. For example if we take
$f = \frac{93}{36}$ then N = 4 and f = [2, 1, 1, 2, 2]. Note that, in general, we have the useful
identity:

$$[q_0, q_1, q_2, q_3, q_4, \ldots] = q_0 + \frac{1}{[q_1, q_2, q_3, q_4, \ldots]} = [q_0, [q_1, q_2, q_3, q_4, \ldots]].$$

Most of our work in this short chapter will be concerned with the regular case
but I will also want to show you some examples of irregular continued fractions.
For these the numerators in the continued fractions may differ from 1 so, as well
as the sequence $(q_n)$, we have a sequence $(p_n)$, and we are interested in numbers
that have a representation:

$$f = q_0 + \cfrac{p_1}{q_1 + \cfrac{p_2}{q_2 + \cfrac{p_3}{q_3 + \cfrac{p_4}{q_4 + \cdots}}}} \qquad (9.2.2)$$

Unless stated otherwise, all continued fractions from now on will be regular.


Theorem 9.2.1. Every positive rational number can be represented by a terminating
continued fraction.


Proof. Consider the rational number $f = \frac{a}{b}$ with a > b. We can apply Euclid’s
algorithm (as we did for $f = \frac{93}{36}$) to see that $f = [q_0, q_1, q_2, \ldots, q_N]$, where the $q_i$ are
the successive quotients produced by the algorithm. On the other hand if a < b we use

$$\frac{a}{b} = \cfrac{1}{\;\cfrac{b}{a}\;},$$

to see that if $\frac{b}{a} = [q_0, q_1, q_2, \ldots, q_N]$ then $f = [0, q_0, q_1, q_2, \ldots, q_N]$. □
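The proof is really an algorithm in disguise, so it can be turned into code. The Python sketch below finds the continued fraction of a positive rational number by repeated division, and then rebuilds the number from its quotients as a check. The function names `continued_fraction` and `evaluate` are my own, and the `Fraction` type from the standard library keeps everything exact.

```python
from fractions import Fraction

def continued_fraction(f: Fraction):
    """Quotients [q_0, q_1, ..., q_N] of a positive rational number f."""
    quotients = []
    a, b = f.numerator, f.denominator
    while b != 0:
        q, r = divmod(a, b)     # a = q*b + r, exactly as in Euclid's algorithm
        quotients.append(q)
        a, b = b, r
    return quotients

def evaluate(quotients) -> Fraction:
    """Rebuild the number from its quotients, working from the inside out."""
    value = Fraction(quotients[-1])
    for q in reversed(quotients[:-1]):
        value = q + 1 / value
    return value

print(continued_fraction(Fraction(93, 36)))   # the example from the text
print(continued_fraction(Fraction(36, 93)))   # the case a < b begins with 0
print(evaluate([2, 1, 1, 2, 2]))
```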


Theorem 9.2.2. There is no positive irrational number which can be represented
by a terminating continued fraction.


Proof. Let α be a positive irrational number. Then we can certainly write $q_0 < \alpha < q_0 + 1$,
where $q_0$ is a nonnegative integer. Write $\alpha = q_0 + \rho_1$ where $0 < \rho_1 < 1$.
So $\alpha = q_0 + \frac{1}{\beta_1}$ where $\beta_1 > 1$. Furthermore $\beta_1$ is irrational for if it is rational, then
so is α and that’s a contradiction. Write $\beta_1 = q_1 + \rho_2$ where $q_1$ is a nonnegative
integer. We then have

$$\alpha = q_0 + \cfrac{1}{q_1 + \rho_2} = q_0 + \cfrac{1}{q_1 + \cfrac{1}{\beta_2}}.$$

Now $\beta_2$ is irrational and we can continue as above to get $\alpha = [q_0, q_1, q_2, q_3, q_4, \ldots]$.
Now suppose this continued fraction terminates. Then $q_{N+1} = 0$ for some N and
so $\beta_N = q_N$. But $\beta_N$ is irrational and $q_N$ is an integer and we have our desired
contradiction. □


In fact it can be shown that a positive irrational number can be written as the
limit of the sequence of ‘convergents’ or to be precise:

$$\alpha = \lim_{N\to\infty}\,[q_0, q_1, \ldots, q_N].$$

This is the sense in which every positive irrational number can be represented as
a ‘non-terminating’ continued fraction. We will not give a proof of this fact here
as the argument is quite lengthy. Instead we’ll look at two delightful examples:


Example 9.1: The Square Root of Two 

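We know that $1 < \sqrt{2} < 2$ so we can write $\sqrt{2} = 1 + \frac{1}{\alpha_1}$, then 

$$\alpha_1 = \frac{1}{\sqrt{2} - 1} = \frac{\sqrt{2} + 1}{(\sqrt{2} - 1)(\sqrt{2} + 1)} = \sqrt{2} + 1.$$

Since $2 < \sqrt{2} + 1 < 3$, we can write $\alpha_1 = 2 + \frac{1}{\alpha_2}$ so at this stage we have 

$$\sqrt{2} = 1 + \cfrac{1}{2 + \cfrac{1}{\alpha_2}}.$$

But 

$$\alpha_2 = \frac{1}{\alpha_1 - 2} = \frac{1}{\sqrt{2} - 1} = \sqrt{2} + 1.$$

So we can write $\alpha_2 = 2 + \frac{1}{\alpha_3}$ and the discerning reader will have noticed the 
emergence of a pattern. Indeed $\alpha_n = 2 + \frac{1}{\alpha_{n+1}}$ for all $n$ and so it follows that 

$$\sqrt{2} = [1, 2, 2, 2, \ldots, 2, \ldots],$$

i.e. $q_0 = 1$ and $q_n = 2$ for all $n \geq 1$. 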
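
To see the convergents of this expansion closing in on the square root of two, here is a small Python sketch. It relies on the standard recurrence for the numerators and denominators of convergents, which is not derived in this chapter, so please treat it as an assumption; the name `convergents` is illustrative.

```python
from fractions import Fraction

def convergents(quotients):
    """Yield the successive convergents of [q0, q1, q2, ...] as exact fractions."""
    h_prev, h = 0, 1   # numerators h_{-2}, h_{-1} of the standard recurrence
    k_prev, k = 1, 0   # denominators k_{-2}, k_{-1}
    for q in quotients:
        h_prev, h = h, q * h + h_prev
        k_prev, k = k, q * k + k_prev
        yield Fraction(h, k)

# Truncate [1, 2, 2, 2, ...] after one, two, ..., six partial quotients.
sqrt2_convergents = list(convergents([1] + [2] * 5))
for c in sqrt2_convergents:
    print(c, float(c * c - 2))   # the squares approach 2 from alternate sides
```

The fractions produced are 1, 3/2, 7/5, 17/12, 41/29 and 99/70, and the square of the last one differs from 2 by only 1/4900.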




Example 9.2: The Golden Section 

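We consider the golden section $\phi = \frac{\sqrt{5} + 1}{2}$ and recall from Chapter 4, section 4 
that $\frac{1}{\phi} = \frac{\sqrt{5} - 1}{2}$. We know that $1 < \phi < 2$ and so we can write $\phi = 1 + \frac{1}{\alpha_1}$. 
So $\frac{1}{\alpha_1} = \phi - 1 = \frac{\sqrt{5} - 1}{2} = \frac{1}{\phi}$. Hence $\alpha_1 = \phi$ and so 

$$\phi = 1 + \frac{1}{\phi} = 1 + \cfrac{1}{1 + \cfrac{1}{\phi}} = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{\phi}}}.$$

Hence we see that 

$$\phi = [1, 1, 1, 1, 1, \ldots],$$

i.e. $q_n = 1$ for all $n$. Isn’t that beautiful! 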

Both the examples of continued fractions that we’ve just looked at have a 
predictable pattern (at least after $q_0$ in the first case). We say that a continued 
fraction expansion is periodic if there exists $N$ such that all numbers after $q_N$ 
are just repetitions of the list $q_{N+1}, q_{N+2}, \ldots, q_{N+k}$ for some natural number 
$k$. For example you can check by using the methods employed above that 
$\sqrt{3} = [1, 1, 2, 1, 2, 1, 2, 1, 2, \ldots]$, so in this case $N = 0$, $k = 2$ and the 
repeating pattern is 1, 2. The French mathematician Joseph-Louis Lagrange¹ 
(1736-1813) showed that a continued fraction is periodic if and only if the number 
it represents has the form $\frac{a + \sqrt{b}}{c}$ where $a$ and $c$ are integers (with $c \neq 0$) and 
$b$ is a natural number that is not a perfect square. 
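
You can check expansions such as the one for the square root of three by machine. The following Python sketch uses a well-known integer-only recurrence for the partial quotients of a square root; that recurrence is not taken from this book, so regard it as an outside assumption, and the function name is illustrative.

```python
from math import isqrt

def sqrt_partial_quotients(n, count):
    """Return the first `count` partial quotients of sqrt(n), n not a perfect square."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    m, d, a = 0, 1, a0          # the current tail is (sqrt(n) + m) / d
    quotients = [a0]
    for _ in range(count - 1):
        m = d * a - m
        d = (n - m * m) // d    # this division is always exact
        a = (a0 + m) // d       # integer part of the new tail
        quotients.append(a)
    return quotients

print(sqrt_partial_quotients(2, 6))   # [1, 2, 2, 2, 2, 2]
print(sqrt_partial_quotients(3, 7))   # [1, 1, 2, 1, 2, 1, 2]
```

Because only whole numbers are involved, no rounding error can creep in, and the repeating block appears exactly.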

When we come to irrational numbers that are not given in terms of square 
roots, we cannot expect periodic behaviour and indeed the following results are 
known and are stated here without proof: 

$$\pi = [3, 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, \ldots]$$
$$e = [2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, 1, 1, 10, \ldots]$$
$$\gamma = [0, 1, 1, 2, 1, 2, 1, 4, 3, 13, 5, 1, 1, \ldots]$$

Of these three, the one for $e$ has the nicest pattern. Bear in mind that the 
expansion for $\gamma$ will terminate if it turns out that this number is rational. 


1 See e.g. http://en.wikipedia.org/wiki/Joseph_Louis_Lagrange 


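We can also obtain some very pleasant representations for $\pi$ and $e$ by using 
irregular continued fractions: 

$$e = 2 + \cfrac{2}{2 + \cfrac{3}{3 + \cfrac{4}{4 + \cdots}}}, \qquad \frac{4}{\pi} = 1 + \cfrac{1^2}{2 + \cfrac{3^2}{2 + \cfrac{5^2}{2 + \cfrac{7^2}{2 + \cfrac{9^2}{2 + \cdots}}}}}$$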
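
Irregular continued fractions are easy to evaluate numerically by working from the innermost term outwards. The Python sketch below is only an experiment: it assumes the two expansions $e = 2 + 2/(2 + 3/(3 + 4/(4 + \cdots)))$ and $4/\pi = 1 + 1^2/(2 + 3^2/(2 + 5^2/(2 + \cdots)))$, cuts each off after a finite number of terms, and uses ordinary floating-point arithmetic. The name `evaluate_irregular` is illustrative.

```python
from math import e, pi

def evaluate_irregular(q0, numerators, denominators):
    """Evaluate q0 + p1/(q1 + p2/(q2 + ...)), truncated after the supplied terms."""
    tail = 0.0
    for p, q in zip(reversed(numerators), reversed(denominators)):
        tail = p / (q + tail)   # fold in one more level, starting from the innermost
    return q0 + tail

depth = 2000                                   # number of levels kept (an arbitrary choice)
terms = list(range(2, depth + 2))              # 2, 3, 4, ... serve as both p_n and q_n
e_approx = evaluate_irregular(2, terms, terms)

odd_squares = [(2 * k + 1) ** 2 for k in range(depth)]   # 1^2, 3^2, 5^2, ...
four_over_pi = evaluate_irregular(1, odd_squares, [2] * depth)

print(e_approx, e)
print(four_over_pi, 4 / pi)
```

If you vary `depth` you will find that the first expansion settles down after a handful of levels, while the second converges very slowly indeed.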
How Infinite Can You Get? 


The infinite has always stirred the emotions of mankind more deeply than 
any other question; the infinite has stimulated and fertilized reason as few 
other ideas have; but also the infinite, more than any other notion, is in 
need of clarification. 

On the infinite, David Hilbert 1 


It’s time to return to the problem of infinity. In Chapter 2 we made a naive but 
unsatisfactory attempt to define infinity as $\frac{1}{0}$. The modern theory of the infinite 
dates back to groundbreaking work by Georg Cantor (1845-1918) during 
1874-84.² He began by trying to figure out what we really mean by counting. Suppose 
that we have a supply of letters and let’s write down six of them: b, g, k, n, p, w. 
How do we know that we really have six symbols? Of course we count them, but 
what does this really mean? Cantor proposed that the essence of counting resides 
in a one-to-one correspondence between natural numbers and the objects we are 
trying to count. So when we count the symbols we are doing something like this: 



b ↔ 1 
g ↔ 2 
k ↔ 3 
n ↔ 4 
p ↔ 5 
w ↔ 6 


1 This famous essay by one of the great mathematical minds of the early twentieth century, 
which was first published in 1925, is reprinted in From Frege to Godel: A Sourcebook in 
Mathematical Logic ed. J. van Heijenoort, Harvard University Press (1967). 

2 See http://en.wikipedia.org/wiki/Georg_Cantor 
In this procedure each symbol is mapped to a number between 1 and 6 and no 
symbol is mapped to more than one number. This is the essence of counting. We 
would implement the same process if we were counting cows in a field or pencils 
on my desk. In every case where we count a collection of discrete objects we put 
them in one-to-one correspondence with³ the natural numbers starting at one. 
The last number we need, after which our collection of objects is exhausted, is 
its size. 

I’ll repeat a key point. The essence of a one-to-one correspondence is that every 
object is related by the arrow to one and only one number. If (for example) we 
used two colours, red and blue, to paint our symbols then we would not have a 
one-to-one correspondence between objects and colours as b, k, n and w might 
all be red while g and p are blue. 

When counting is viewed as a one-to-one correspondence, Cantor used the 
word cardinality to describe it. So the number 6 is the cardinality of the collection 
b, g, k, n, p, w. If I take a smaller selection of these symbols e.g. g, p, w then it 
has a smaller cardinality - which is 3 in this case. 

As long as we stick with finite collections,⁴ then we cannot ever set up a 
one-to-one correspondence between a collection of objects and a sub-collection 
comprising some (but not all) of the original collection. When we extend to 
collections of infinite size, one of Cantor’s greatest achievements was to turn this 
on its head. Let’s start with the natural numbers 1, 2, 3, ... This is an infinite 
collection - indeed we saw way back in Chapter 1 that there is no largest element. 
Cantor defined the cardinality of the collection of all natural numbers to be $\aleph_0$. 
Here $\aleph$ (pronounced ‘aleph’) is the first letter in the Hebrew alphabet and the 
subscript 0 should just be accepted for now. Now at the moment $\aleph_0$ is just a 
symbol and you may be wondering why Cantor doesn’t just write $\infty$? 

Be aware that Cantor is using $\aleph_0$ as a symbol for infinity in a very special way. 
Let’s go back to the number 3. From Cantor’s point of view this is the property 
that all collections of three objects in the universe share in common. They all 
partake in ‘threeness’. Similarly $\aleph_0$ should be the property that certain (why not 
all?) infinite collections of objects share in common. So any collection that can 
be put into one-to-one correspondence with the natural numbers will also have 
cardinality $\aleph_0$. Now consider the even numbers 2, 4, 6, 8, ... There are surely 
‘fewer of these’ than there are of the natural numbers as we’ve omitted all the 
odd numbers, however they also have cardinality $\aleph_0$ since we have the one-to-one 
correspondence: 

1 ↔ 2, 2 ↔ 4, 3 ↔ 6, 4 ↔ 8, ..., 

indeed any natural number is mapped uniquely to an even number by the 
formula $n \mapsto 2n$. This example demonstrates clearly that when we get to infinite 

3 Technically - a subset of - see next footnote. 

4 Mathematicians use the technical term ‘set’ when they want to discuss collections of 
(mathematical) objects. See Appendix 2. 
collections it is no longer true that the whole is greater than all of its parts. Indeed 
we’ve just shown that $\aleph_0 = \frac{\aleph_0}{2}$ and this should be compared with the ‘algebra of 
infinity’ that we discussed rather naively at the end of Chapter 2. 

Now let’s consider the integers. There should be $2\aleph_0 + 1$ of these as each natural 
number has a negative partner and we also want to include zero. From what you’ve 
just seen you might guess that $2\aleph_0 + 1 = \aleph_0$ and that is indeed the case as we 
have a one-to-one correspondence between natural numbers and integers given 
as follows: 

1 ↔ 0, 2 ↔ 1, 3 ↔ -1, 4 ↔ 2, 5 ↔ -2, 6 ↔ 3, 7 ↔ -3, ... 

If you want to express this by a concise formula, we have $n \leftrightarrow \frac{n}{2}$ if $n$ is even 
and $n \leftrightarrow -\frac{n-1}{2}$ if $n$ is odd. Here natural numbers are on the left-hand side of 
the arrow and integers on the right-hand side. 
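
The correspondence between natural numbers and integers is simple enough to try out directly. Here is a minimal Python sketch of the formula together with its inverse; both function names are illustrative assumptions.

```python
def natural_to_integer(n):
    """Send n = 1, 2, 3, ... to 0, 1, -1, 2, -2, ...: n/2 if n is even, -(n-1)/2 if odd."""
    if n % 2 == 0:
        return n // 2
    return -((n - 1) // 2)

def integer_to_natural(z):
    """The inverse map: positive integers go to even naturals, the rest to odd naturals."""
    return 2 * z if z > 0 else 1 - 2 * z

print([natural_to_integer(n) for n in range(1, 8)])   # [0, 1, -1, 2, -2, 3, -3]
```

Having an inverse in hand is what confirms that the correspondence really is one-to-one: every integer is hit exactly once.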

Now what about the rational numbers? Surely there are going to be more of 
these than the natural numbers. Cantor proved there aren’t. To see why let’s 
first of all restrict to the positive rational numbers. We can write these in a list as 
follows: 


1/1   2/1   3/1   4/1   5/1    6/1    7/1   ... 
1/2   3/2   5/2   7/2   9/2   11/2   13/2   ... 
1/3   2/3   4/3   5/3   7/3    8/3   10/3   ... 
1/4   3/4   5/4   7/4   9/4   11/4   13/4   ... 
 ⋮ 


Each number is written in the form $\frac{p}{q}$. In the first row we have listed all the 
numbers for which $q = 1$ and these are of course the natural numbers. In the 
second row we have listed all the numbers for which $q = 2$ and which aren’t 
already in row 1 - so we omit e.g. $\frac{2}{2}$ as it is already included as the number 
$\frac{1}{1}$. Clearly there are an infinite number of rows and each row is infinite. So the 
cardinality of the positive rational numbers is surely $\aleph_0^2 = \aleph_0 \times \aleph_0$. We now show 
that that is precisely $\aleph_0$. To see this we simply count the numbers in the order 
of increasing $p + q$ but where these are the same we count the number with the 
highest $p$ first. So e.g. $\frac{4}{1}$ and $\frac{3}{2}$ both have $p + q = 5$ but since $4 > 3$ we count $\frac{4}{1}$ 
first. We then get the following one-to-one correspondence: 

1 ↔ 1/1, 2 ↔ 2/1, 3 ↔ 1/2, 4 ↔ 3/1, 5 ↔ 1/3, 6 ↔ 4/1, 7 ↔ 3/2, 8 ↔ 2/3, 9 ↔ 1/4, ... 

You can include negative rational numbers and zero in this scheme by imitating 
the argument that we used above to count the integers.⁵ 
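
The counting scheme for the positive rationals can be written as a short Python generator. This is a sketch under the assumption that ‘not already listed’ means ‘in lowest terms’, which is how the rows of the table were built; the name `positive_rationals` is illustrative.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def positive_rationals():
    """Yield every positive rational exactly once, by increasing p + q, highest p first."""
    for total in count(2):                  # total = p + q
        for p in range(total - 1, 0, -1):   # highest numerator first
            q = total - p
            if gcd(p, q) == 1:              # skip repeats such as 2/2, already counted as 1/1
                yield Fraction(p, q)

first_ten = list(islice(positive_rationals(), 10))
print([str(r) for r in first_ten])
```

The generator never stops, which is just as it should be; `islice` simply lets us peek at as many terms of the list as we care to see.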

Ingenious as this argument is we seem to be discovering a predictable 
monotony about the infinite. Natural numbers, even numbers, integers and 
rational numbers all have the same cardinality. We say they are countable as 
they can all be put into a one-to-one correspondence with the natural numbers. 
But there is a twist in the tale. Consider the open interval (0, 1). It cannot be put 
into a one-to-one correspondence with the natural numbers and Cantor called it 
an uncountable set for this reason. How do you prove (0, 1) is uncountable? Write 
each point in the interval as an infinite decimal $0.x_1 x_2 x_3 \cdots$ Now suppose that 
(0, 1) is countable. Then we have a one-to-one correspondence 

$$\begin{aligned}
1 &\leftrightarrow 0.a_1 a_2 a_3 a_4 a_5 \cdots \\
2 &\leftrightarrow 0.b_1 b_2 b_3 b_4 b_5 \cdots \\
3 &\leftrightarrow 0.c_1 c_2 c_3 c_4 c_5 \cdots \\
4 &\leftrightarrow 0.d_1 d_2 d_3 d_4 d_5 \cdots \\
5 &\leftrightarrow 0.e_1 e_2 e_3 e_4 e_5 \cdots \\
&\ \ \vdots
\end{aligned}$$

It doesn’t matter what specific numbers the as, bs, cs, ds and es represent. They 
are just numbers between 0 and 9. Now if (0, 1) really is countable then every real 
number between 0 and 1 occurs on the right-hand side in this correspondence. 
Cantor used his famous diagonal argument to construct a number between 0 and 
1 which is not on the list. His candidate was the number $0.x_1 x_2 x_3 x_4 x_5 \cdots$ where 
$x_1 \neq a_1$, $x_2 \neq b_2$, $x_3 \neq c_3$, $x_4 \neq d_4$, $x_5 \neq e_5$ etc. Now which number on the list is 
this one? As $x_1 \neq a_1$ it can’t be the first one, as $x_2 \neq b_2$ it isn’t the second, since 
$x_3 \neq c_3$ it won’t be the third and so it goes. As the list is assumed to be complete 
we’ve deduced a contradiction. 
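
The diagonal construction can be acted out on any finite portion of a supposed list. The Python sketch below is only an illustration: the five rows of digits are invented for the purpose, and the rule ‘use 5 unless the diagonal digit is 5, in which case use 6’ is one arbitrary choice among many.

```python
def diagonal_escape(listed_digits):
    """Build a decimal whose nth digit differs from the nth digit of the nth row."""
    new_digits = []
    for n, row in enumerate(listed_digits):
        d = int(row[n])                              # the digit on the diagonal
        new_digits.append("5" if d != 5 else "6")    # any digit other than d would do
    return "0." + "".join(new_digits)

# Hypothetical opening rows of a list, each giving the digits after the decimal point.
listing = ["50000", "33333", "14159", "71828", "41421"]
x = diagonal_escape(listing)
print(x)   # 0.65555
```

Whatever rows you supply, the number returned disagrees with row $n$ in its $n$th decimal place, so it cannot appear anywhere in the list.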

Cantor assigned the letter $c$ to be the cardinality of the interval (0, 1). The 
letter $c$ stands for ‘continuum’ and what we’ve just seen tells us that our notion 
of the infinite changes when we pass from the discrete to the continuous. $\aleph_0$ is 
the infinity of the discrete and $c$ is the infinity of the continuous. In fact $c$ is 
also the cardinality of the whole real line because it can be put into one-to-one 
correspondence with (0, 1) through the formula $x \mapsto \tan\left(\pi x - \frac{\pi}{2}\right)$. Since the real 
line contains the natural numbers as a subcollection we can assert that 

$$\aleph_0 < c.$$


In fact it can be shown that $c = 2^{\aleph_0}$ and this beautiful formula is embossed 
on Cantor’s memorial in his home town of Halle. For a quick insight as to why 
this is true - write each non-zero real number between 0 and 1 in the form 

5 The scheme presented here is not the only way of enumerating the positive rational 
numbers and you may well meet others when you read different textbooks. 
$0.a_1 a_2 a_3 \cdots$ where instead of the usual decimal expansion we use binary. So each 
$a_n$ is either a 0 or a 1 and we have a one-to-one correspondence between the 
interval (0, 1) and the collection of all binary sequences $(a_n)$. Now there are two 
ways of choosing $a_1$, two ways of choosing $a_2$ etc. As there are $\aleph_0$ numbers in each 
sequence the cardinality of the collection of all of these is precisely $2^{\aleph_0}$ and so this 
is the cardinality of (0, 1). From what has been written above we see that it is also 
the cardinality of the real line. Incidentally, if you remove all the rational numbers 
from the real line and just let the irrational numbers remain then the cardinality 
of these is also $c$ and so 


So far our notion of counting and its extension to infinite collections has been 
based on the notion of one-to-one correspondence. It doesn’t matter how we 
order the numbers 1, 2 and 3 - if it is 1, 2, 3 or 3, 1, 2 the cardinality of the 
collection is always 3. But this ignores the fact that we count numbers in order 
1, 2, 3, ... and $1 < 2 < 3 < \cdots$ Cantor argued that counting numbers in order is a 
different process from matching them up in a one-to-one correspondence. When 
collections of numbers are finite, we don’t notice this but when they are infinite 
the distinction matters. So imagine that we try to count the natural numbers in 
order. Then we get $1, 2, 3, 4, \ldots, 100, \ldots, 1000, \ldots, 1000000, \ldots, \omega_0$. Now $\omega_0$ is 
an infinite number that is the end-product of counting all the natural numbers in 
order starting at the number 1. It is not the same as $\aleph_0$ which is the cardinality of a 
collection that is put into one-to-one correspondence with the natural numbers. 
Cantor argued that ‘putting into one-to-one correspondence’ and ‘counting in 
order’ are logically distinct processes and each has its own notion of ‘infinity’. 
Cantor called $\omega_0$ an ordinal number. 

When we deal with cardinal numbers, the ‘algebra of infinity’ ensures that 
$\aleph_0 + 1 = 1 + \aleph_0$. But this is not the case with ordinal numbers. Once we have 
counted to $\omega_0$, we can start again and go to the next number $\omega_0 + 1$ and then $\omega_0 + 2$ 
followed by $\omega_0 + 3$ etc. The cardinality of the collection of numbers (or ‘set’ if 
we are going to use the precise language of mathematicians) 
$1, 2, 3, \ldots, \omega_0, \omega_0 + 1, \omega_0 + 2, \omega_0 + 3$ is still $\aleph_0$ but the highest number we have reached in our 
counting is $\omega_0 + 3$ which is bigger than $\omega_0 + 2$ which is bigger than $\omega_0 + 1$ which 
is itself bigger than $\omega_0$, where ‘bigger’ is understood solely in the sense of ordinal 
numbers. Furthermore $1 + \omega_0 \neq \omega_0 + 1$ as $1 + \omega_0$ is the end-point that we get to 
when we start counting in order from 2 as the starting point, and this is precisely 
$\omega_0$. So $1 + \omega_0 = \omega_0$ which is smaller than $\omega_0 + 1$. 

Now Cantor was able to show that as we carry on counting these ‘transfinite 
ordinals’ eventually we will reach an ordinal number called $\omega_1$ which is such 
that the cardinality of the collection $1, 2, 3, \ldots, \omega_0, \ldots, \omega_1$ is no longer $\aleph_0$. He 
used the notation $\aleph_1$ to denote the cardinality of this set. Beyond $\omega_1$ lies $\omega_2$ and 
this leads to a collection with cardinality $\aleph_2$ and so we see the beginnings of the 
development of an infinite sequence of both transfinite ordinals and associated 
transfinite cardinals. 
Where does $c$ fit into this story? Cantor’s famous continuum hypothesis 
conjectures that $c = \aleph_1$, but he was unable to prove this. In 1963, the logician 
Paul Cohen (1934-2007) showed that this conjecture was undecidable in that 
there are valid axiom systems for the foundations of mathematics under which 
it is true and others under which it is false. This doesn’t bother most working 
mathematicians for whom the fine properties of higher orders of infinity are an 
irrelevance. But the fact that $\aleph_0$ and $c$ are different is of great importance in 
modern mathematics. 

In his famous essay ‘On the infinite’ which starts this chapter, David Hilbert 
makes the distinction between ‘potential infinity’ and ‘actual infinity’.⁶ Potential 
infinity has been the theme of much of this book. Mathematical analysis is based 
on the notion of the limit which avoids actual infinity by using only finite processes 
in the mathematical development. However the development of set theory at the 
end of the nineteenth century forced mathematicians to directly use collections of 
infinite objects in their reasoning and Cantor was then able to give ‘actual infinity’ 
a precise meaning. Not all mathematicians agreed with this but we will not go 
into the details of that saga here.⁷ Hilbert wrote ‘No-one shall expel us from the 
Paradise that Cantor has created’ and contemporary mathematicians (as I pointed 
out above) are content to let this ‘Paradise’ be, though few venture very far into 
its garden of delights. 


6 In fact this distinction goes back to the Greek thinker Aristotle (384-322 BCE), who in the 
fourth century BCE wrote in his book Physics: ‘The infinite has a potential existence ... there will 
not be an actual infinite’. 

7 Leopold Kronecker, whom we met at the beginning of Chapter 2, was a strong opponent 
of Cantor’s ideas. 
Constructing the Real Numbers 


Mathematical analysis is as extensive as nature herself. 

Joseph Fourier 


A key theme of this book has been the subtlety and beauty of the real number 
system. But what are real numbers anyway? We have seen that the rational 
numbers are inadequate for many of the purposes of mathematics. Indeed they 
cannot fill up the real number line. Irrational numbers can be obtained as limits of 
sequences of rational numbers using decimal expansions (or continued fractions). 
But this isn’t really satisfactory as the notion of limit was itself based on the 
assumption that real numbers exist. The real numbers can be systematically built 
from the rational numbers using either of two approaches. One of these is the idea 
of a Dedekind cut which was introduced by Richard Dedekind (1831-1916)¹ and 
the other is by using Cauchy sequences and was due to Georg Cantor (1845-1918).² 

We’ll briefly sketch each of these. However I stress that this chapter is highly 
incomplete and gives only the skimpiest of introductions to a highly complex 
subject. 


11.1 Dedekind Cuts 



Let’s mark all the rational numbers on the real number line. Of course this leaves 
holes where the irrational numbers need to go. Dedekind’s idea is to use all the 
rational numbers that are smaller than the irrational number that we want to 


1 See http://en.wikipedia.org/wiki/Richard_Dedekind 

2 See http://en.wikipedia.org/wiki/Georg_Cantor 
construct as a tool to fill the hole. To be more precise - we define a cut $c$ to be a 
collection (or set) of rational numbers that has the following four properties: 

(Ci) c contains at least one rational number. 

(Cii) If p is in c then so is any rational number smaller than p. 

(Ciii) c has no largest element, so if p is in c then there exists a rational number 
q such that q > p and q is in c. 

(Civ) c does not contain all the rational numbers. 

We now define the real number line to be the collection of all possible cuts. 
This is a radical re-interpretation of number. We might ask what $\sqrt{2}$ is. The 
answer is that it is the cut that contains all negative rational numbers together 
with all rational numbers $p \geq 0$ for which $p^2 < 2$ (the negative numbers are needed 
for (Cii) to hold). So for example 1 is in this cut, so is 1.4 and so is 1.4142135. All the rational 
numbers themselves are naturally associated with cuts, so the rational number $q$ 
is identified with the cut containing all rational numbers $p$ such that $p < q$, so 
for example the cut which represents 1.5 contains the numbers 1.49, 1.499 and 
1.499999999999999, but it does not contain 1.5 itself. 
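
One way to get a feel for cuts is to model each one as a membership test on rational numbers. The following Python sketch does this for the cut representing the square root of two and for the cut representing a rational number. It assumes Python’s exact `Fraction` type as a stand-in for the rationals, and both function names are illustrative.

```python
from fractions import Fraction

def in_sqrt2_cut(p):
    """Is the rational p in the cut for sqrt(2)? True for negatives and whenever p*p < 2."""
    p = Fraction(p)
    return p < 0 or p * p < 2

def rational_cut(q):
    """Return the membership test for the cut representing the rational number q."""
    q = Fraction(q)
    return lambda p: Fraction(p) < q

print(in_sqrt2_cut(Fraction("1.4142135")))   # True: its square is just below 2
print(in_sqrt2_cut(Fraction("1.4142136")))   # False: its square is just above 2

cut_for_1_5 = rational_cut(Fraction(3, 2))
print(cut_for_1_5(Fraction("1.499999999999999")), cut_for_1_5(Fraction(3, 2)))   # True False
```

Note that no irrational number, and no square root, ever appears in the code: the test for the first cut uses nothing but rational arithmetic, which is the whole point of Dedekind’s idea.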

A natural question we might ask is if cuts are really going to represent real 
numbers then how do we do arithmetic with them? Well we have to redefine 
addition and multiplication, so for example if c and d are cuts then c + d is 
the cut which contains all rational numbers p + q where p is in c and q is in d. 
Multiplication is a little more complicated to define. Nonetheless it can be shown 
that these definitions coincide with our usual notions (so e.g. 2 + 3 still equals 
5) and all the usual algebraic laws of numbers continue to hold for cuts such as 
a(b + c) = ab + ac. Also if c and d are cuts we say that c < d if d contains at least 
one rational number that is larger than every rational number in c. 

It’s important to appreciate that the purpose of cuts is to give us a systematic 
approach to defining real numbers from rationals. They will not help us to do 
practical calculations with numbers but they do give us a firm foundation on 
which to rest the vast edifice of theorems about real numbers, including those 
we’ve met earlier in this book. 


11.2 Cauchy Sequences 


An alternative way of constructing the real numbers other than using cuts is 
to employ Cauchy sequences. Let’s first define this concept within the spirit of 
Chapter 4 where we assume that real numbers exist. Then we’ll see how the concept 
can be tweaked to actually produce real numbers from rationals. We say that the 
sequence $(x_n)$ is a Cauchy sequence if given any $\epsilon > 0$ there exists a natural number 
$N$ such that $|x_n - x_m| < \epsilon$ whenever both of $m, n > N$. This looks remarkably like 
the definition of a limit except that we haven’t asked that the sequence converges. 
A Cauchy sequence has the property that if you go far enough along it, then 
the terms of the sequence become arbitrarily close. It is easy to prove that every 
convergent sequence is Cauchy. For suppose that $(x_n)$ converges to a limit $l$ then 
using the MFT and the triangle inequality 

$$|x_n - x_m| = |(x_n - l) + (l - x_m)| \leq |x_n - l| + |l - x_m|,$$

and I’ll leave the rest to you. Conversely, Augustin-Louis Cauchy (1789-1857) 
proved that every Cauchy sequence converges to a limit but he was assuming that 
real numbers exist. Let’s drop that assumption and return to a world where the 
only numbers are the rationals. 

Cantor first redefined Cauchy sequences using rational numbers only. So for 
the rest of this section a Cauchy sequence is a sequence $(x_n)$ of rational numbers 
which has the property that, given any rational $\epsilon > 0$, there exists a natural number 
$N$ such that $|x_n - x_m| < \epsilon$ whenever both of $m, n > N$. Cantor’s idea was to define 
the real number line as the collection of all (rational) Cauchy sequences. So for 
example the rational number 1 should be identified with the Cauchy sequence 
(1, 1, 1, ...). But there is a problem here, as the sequence of partial sums of the 
geometric series $\sum_{n=1}^{\infty} \frac{1}{2^n}$ also 
converges to 1. To overcome this problem we need one more definition. Two 
Cauchy sequences $(x_n)$ and $(y_n)$ are said to be equivalent if, given any rational 
$\epsilon > 0$, there exists a natural number $N$ such that if $n > N$ then $|x_n - y_n| < \epsilon$. 
This is clearly the case for the two sequences above which both converge to 1. 
Cantor’s approach was to identify equivalent Cauchy sequences as representing 
the same real number.³ So $\sqrt{2}$ is identified with the Cauchy sequence that begins 
(1, 1.4, 1.41, 1.414, 1.4142, 1.41421, 1.414213, 1.4142136, ...) and also with the 
sequence $(x_n)$ defined recursively by $x_1 = 1$ and for $n \geq 1$, $x_{n+1} = \frac{x_n}{2} + \frac{1}{x_n}$ (see 
Section 5.4). If we use Cauchy sequences to define real numbers then addition 
and multiplication are both fairly easy - so if the Cauchy sequences $(a_n)$ and $(b_n)$ 
represent the real numbers $a$ and $b$ respectively then $a + b$ is represented by the 
Cauchy sequence whose $n$th term is $a_n + b_n$ while $ab$ corresponds to the Cauchy 
sequence with $n$th term $a_n b_n$. To capture the notion of ‘less than’ we need to be 
more subtle - we say that $a < b$ if there exist a rational $\epsilon > 0$ and a natural 
number $N$ such that if $n > N$ then $b_n - a_n > \epsilon$. 
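
To see two equivalent Cauchy sequences at work, here is a Python sketch that builds, in exact rational arithmetic, the recursively defined sequence with $x_1 = 1$ and $x_{n+1} = x_n/2 + 1/x_n$ alongside a sequence of decimal truncations 1, 1.4, 1.41, ... The truncations are generated by an integer square root, which is an assumption of this sketch rather than a method from the text, and the function names are illustrative.

```python
from fractions import Fraction
from math import isqrt

def recursive_sequence(terms):
    """x_1 = 1 and x_{n+1} = x_n / 2 + 1 / x_n; every term is a rational number."""
    x = Fraction(1)
    out = [x]
    for _ in range(terms - 1):
        x = x / 2 + 1 / x
        out.append(x)
    return out

def decimal_truncations(terms):
    """1, 1.4, 1.41, ...: the largest k-place decimals whose square does not exceed 2."""
    return [Fraction(isqrt(2 * 10 ** (2 * k)), 10 ** k) for k in range(terms)]

xs = recursive_sequence(8)
ys = decimal_truncations(8)
gaps = [abs(x - y) for x, y in zip(xs, ys)]
print([float(g) for g in gaps])   # the gaps between matching terms shrink towards zero
```

Neither sequence contains anything other than fractions, yet the gap between matching terms can be made as small as we please, which is exactly what equivalence demands.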


11.3 Completeness 


Both definitions of real numbers that we have given require us to think about 
numbers in radically new ways. They each employ sets of rational numbers to 


3 This really requires the notion of an equivalence relation. 
represent real numbers - either gathered together in cuts or assembled into 
Cauchy sequences.⁴ However you define real numbers, the power of the definition 
rests in its ability to give a thorough logical foundation for further investigation. 
One of the most important results, which flows from the construction of the real 
numbers from either Dedekind cuts or Cauchy sequences, is the completeness 
property of the real numbers. We will now prove this important property using 
Dedekind cuts.⁵ 

Theorem 11.3.1. Let A be a non-empty collection (or set) of real numbers that is 
bounded above. Then it has a least upper bound. 

Let’s be clear what the theorem says. A non-empty collection of numbers means 
that we have assembled at least one (and possibly infinitely many) numbers and 
we call this collection $A$. Bounded above means that there exists a real number 
$K$ such that if $a$ is any of the numbers that is in $A$ then we must have $a \leq K$. 
Finally to have a least upper bound means that the collection of all possible $K$ 
that do this job has a smallest member which we call sup(A). A sequence $(a_n)$ is 
an example of a collection of numbers so Theorem 11.3.1 guarantees that every 
sequence that is bounded above has a supremum and this underlies everything 
we did in Chapter 5. 

Proof of Theorem 11.3.1. In this proof I’ll have to use set theoretic notation (see 
Appendix 2). So if $A$ is a set of numbers then $B \subset A$ means $B$ is a (proper) subset 
of $A$ (i.e. every number in $B$ is also in $A$ but $A$ contains at least one number that is 
not in $B$) and $A \subseteq B$ means either $A \subset B$ or $A = B$. 

Now let $A$ be a non-empty subset of the real numbers so $A$ contains at least 
one cut. Assume that $A$ is bounded above and let $\beta$ be an upper bound for $A$. This 
means that $c \subseteq \beta$ for every cut $c$ in $A$. Define $\alpha$ to be the set that contains every 
rational number in every cut in $A$ (i.e. $\alpha$ is the union of all the cuts in $A$). We will 
show firstly that $\alpha$ is a cut and secondly that $\alpha = \sup(A)$. Then the theorem will 
be proved. 

To show that $\alpha$ is a cut we must check that (Ci) to (Civ) are satisfied. Let $c$ 
be a cut in $A$. Every rational number $q$ in $c$ is also in $\alpha$ so (Ci) is satisfied. If $q$ is 
any rational number in $\alpha$ then $q$ lies in one of the cuts that form $\alpha$, and so every 
rational number smaller than $q$ is in that cut and hence in $\alpha$, so (Cii) is satisfied. 
For (Ciii) if $\alpha$ contains a largest rational number then so does one of the cuts that 
form $\alpha$ and that’s a contradiction. Finally, it is clear that $\alpha \subseteq \beta$ and (Civ) follows 
as any rational number that is larger than every rational number in $\beta$ is certainly 
not in $\alpha$. 


4 Technically speaking - equivalence classes of these. 

5 You can find an online proof using Cauchy sequences sketched on http://en.wikipedia. 
org/wiki/Construction_of_the_reaLnumbers#Construction_from_Cauchy_sequences 
Now to show $\alpha = \sup(A)$, note first that $\alpha$ is an upper bound for $A$, since $c \subseteq \alpha$ 
for every cut $c$ in $A$. Next suppose that $\gamma < \alpha$ (as numbers). We’ll show that 
$\gamma$ cannot be an upper bound for $A$. As $\gamma \subset \alpha$ (as sets) there must exist a cut $d$ in 
$A$ so that $d$ contains a rational number that is greater than every rational number 
in $\gamma$. But then $\gamma$ cannot possibly be an upper bound for $A$ and so we are done. □ 

Once we have Theorem 11.3.1 we can see that a Dedekind cut c is essentially 
identified with the unique real number sup(c). Although we won’t prove it here, 
it’s an important fact that the completeness property is equivalent to the statement 
that every Cauchy sequence has a limit in the real numbers. 

One of the consequences of completeness (Theorem 11.3.1) is a key property 
of the real numbers that is named in honour of the great Archimedes of Syracuse 
(c.287-c.212 BCE). We used this property without proof in Chapter 4 to show 
that $\lim_{n \to \infty} \frac{1}{n} = 0$. Now let’s see how it is derived from Theorem 11.3.1. 

Theorem 11.3.2 (Archimedean Property of Real Numbers). Let $x$ and $y$ be 
arbitrary positive real numbers. Then there exists a natural number $n$ such that 

$$nx > y.$$

Proof. Suppose that the statement is false so that $nx \leq y$ for all $n$. Then the 
set $A$ of all numbers of the form $nx$ (where $n$ is a natural number) is non-empty 
(it contains $x$) and is bounded above by $y$. Hence $A$ has a least upper bound by 
Theorem 11.3.1. Write $\alpha = \sup(A)$ and pick any natural number $n$. Then $(n + 1)x$ 
is in the set $A$ and so $(n + 1)x \leq \alpha$. It follows that 

$$nx \leq \alpha - x < \alpha.$$

As this argument works for any $n$ we see that $\alpha - x$ is also an upper bound for $A$ 
and that contradicts the fact that $\alpha$ is the smallest of these. □ 
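
The theorem only asserts that a suitable $n$ exists, but when $x$ and $y$ are rational we can write one down explicitly. Here is a minimal Python sketch, restricted to rational inputs so that the arithmetic is exact; the name `archimedean_n` is illustrative, and the formula used is simply one convenient choice, not part of the proof above.

```python
from fractions import Fraction
from math import floor

def archimedean_n(x, y):
    """Return the smallest natural number n with n*x > y, for positive rationals x and y."""
    x, y = Fraction(x), Fraction(y)
    if x <= 0 or y <= 0:
        raise ValueError("x and y must both be positive")
    return floor(y / x) + 1     # the first whole number strictly beyond y/x

n = archimedean_n(Fraction(1, 1000), 7)
print(n, n * Fraction(1, 1000) > 7)   # 7001 True
```

However tiny the step $x$ and however distant the target $y$, enough steps will always carry us past it.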

We’ll finish this chapter with a delightful and intriguing property of the real 
numbers. This is sometimes referred to as the density of the rational numbers in 
the real numbers. 

Theorem 11.3.3. Given any two real numbers x and y with y > x there exists a 
rational number q such that 

x < q < y 

Proof. As $y - x > 0$ we can apply Theorem 11.3.2 to find a natural number $n$ 
such that $n(y - x) > 1$, i.e. 

$$nx + 1 < ny \qquad \text{(i)}$$

If $x > 0$ we can use Theorem 11.3.2 again to show that there exists a natural 
number $m_1$ such that $m_1 > nx$, and then any natural number $m_2$ has the property 
that $m_2 > -nx$. A moment’s reflection convinces us that we can drop the 
requirement $x > 0$ and two such numbers still exist. So $-m_2 < nx < m_1$ for 
any real number $x$. Since $nx$ lies between two integers, we can narrow this down 
and find an integer $m$ such that $m - 1 \leq nx < m$, i.e. 

$$m \leq nx + 1 < m + 1 \qquad \text{(ii)}$$

Combining (i) and (ii) together we get $nx < m < ny$, and so 

$$x < \frac{m}{n} < y.$$

Hence $\frac{m}{n}$ is our required rational number. □ 
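
The proof is constructive enough to be turned into a recipe. The Python sketch below follows its two steps: choose $n$ with $n(y - x) > 1$, then take the integer $m$ with $m - 1 \leq nx < m$. The name `rational_between` is illustrative, and when floating-point numbers are passed in they are only approximate stand-ins for real numbers.

```python
from fractions import Fraction
from math import floor, sqrt

def rational_between(x, y):
    """Return a rational m/n with x < m/n < y, following the two steps of the proof."""
    if not x < y:
        raise ValueError("need x < y")
    n = floor(1 / (y - x)) + 1   # guarantees n * (y - x) > 1
    m = floor(n * x) + 1         # the integer with m - 1 <= n*x < m
    return Fraction(m, n)

q = rational_between(sqrt(2), sqrt(3))
print(q)   # 3/2
print(rational_between(Fraction(-1, 4), Fraction(-1, 5)))   # -5/21
```

The second call shows that the recipe copes with negative numbers too, just as the ‘moment’s reflection’ in the proof promises.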

If x and y are both irrational numbers then Theorem 11.3.3 tells us that there 
exists a rational number q such that x < q < y, i.e. there is a rational number 
between every pair of irrationals. On the other hand, in Theorem 3.1.1 we showed 
that there are infinitely many irrational numbers between every pair of rationals. 
This seems to suggest that there are as many rationals as irrationals, but we 
saw in Chapter 10 that the irrationals are of a higher order of infinity than the 
rationals. This gives us yet more insight into how complex and counter-intuitive 
the structure of the real number system is. 
Where to Next in Analysis? 
The Calculus 


Next to the creation of Euclidean geometry, the calculus has proved to 
be the most original and fruitful concept in all of mathematics. 

Mathematics and the Physical World, M. Kline 


The end of the last chapter was the effective conclusion of this book. But some 
readers may be eager for more and this chapter is an attempt to answer the 
question - where do I go from here? The simple answer is learn calculus (if you 
haven’t already done so), pick up a more advanced textbook on analysis (see 
Further Reading for some suggestions) and start to work systematically through 
it. In this short chapter I’ll just give a sneak preview of some of the key new ideas 
that you’ll meet next. 


12.1 Functions 


The function concept has been something of a ‘ghost at the feast’ so far and yet it 
is one of the most fundamental ideas in the whole of mathematics. Functions¹ are 
the way in which mathematics describes relationships. In the simplest form these 
will be relationships between two variables $x$ and $y$. As we explore different values 
of $x$, the variable $y$ also changes in a predictable (but perhaps quite complicated) 
manner. The way in which the value of $y$ depends on that of $x$ is expressed through 
the function concept and we write $y = f(x)$ to formally describe the relationship. 
This is shorthand for ‘$y$ is a function of $x$’. To give a logically watertight definition 
of a function needs more set theory than I want to go into here so we’ll skip 


Nowadays these are often called 'mappings'. 
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Figure 12.1. f(x) = c with c > 0. 

that and concentrate on examples. 2 To see the simplest function just pick a real 
number c and defme/(x) = c. This is called the ‘constant function’, as each real 
number x is inexorably mapped to the number c. We get insight into functions 
by drawing their graphs and the constant function is shown in Figure 12.1. 

Our next example, shown in Figure 12.2, is y = mx where m is fixed. If (say) 
m = 2 we have y = 2x and so f(0) = 0, f(1) = 2, f(−0.48) = −0.96, f(2) = 4 etc. 
The graph of the function is a straight line through the origin which has positive 
slope if m > 0 and negative slope if m < 0. 

At this stage, it is worth pointing out that f(x) and f are different mathematical 
entities which should not be confused with each other. f(x) is a number - it is 
the value of the function at the point x. On the other hand, f is our symbol for 
the function itself which represents a relationship. It is not a number and can be 
seen as a higher-order mathematical object. Later on we will look at the next stage 
where we will meet an operator. Just as functions take numbers to numbers so (at 
the next level up), operators map functions to functions. 
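
Programming languages draw the same distinction, and a short Python sketch may help to fix it in the mind. Below, `f` is the function itself, `f(1.0)` is a number, and `shift_up` plays the role of an operator. The names `RealFunction` and `shift_up` are hypothetical, chosen only for this illustration.

```python
from typing import Callable

RealFunction = Callable[[float], float]


def f(x: float) -> float:
    """The linear function f(x) = 2x (the case m = 2 from the text)."""
    return 2 * x


def shift_up(g: RealFunction, c: float) -> RealFunction:
    """An operator: it takes the function g to the function x -> g(x) + c."""
    def shifted(x: float) -> float:
        return g(x) + c
    return shifted


value = f(1.0)        # a number: the value of f at the point 1
h = shift_up(f, 3.0)  # a function: the graph of f moved up by 3
print(value, h(1.0))
```

Notice that `shift_up` never asks for a point x. It consumes a whole function and hands back a whole function, which is exactly what is meant by working one level up.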

Now let’s get back to examples. We’ve met f(x) = c and f(x) = mx. Let’s 
combine them and consider f(x) = mx + c. This is the general linear function, 
shown in Figure 12.3. Any non-vertical straight line drawn in the plane appears as the 
graph of such a function. As above it has slope m, and c measures the (signed) distance 
from the origin to the point where the line meets the y axis. 

If the highest power of x that appears in the definition of a function is 2 then 
the function is said to be quadratic. The simplest such function is $f(x) = x^2$ and 
the most general is $f(x) = ax^2 + bx + c$ where a, b and c are real numbers with 
$a \neq 0$. In this case the graph (Figure 12.4) is always a parabola. 

2 In general, we can make sense of the function concept as a mapping between two 
arbitrary sets. 
Figure 12.4. $f(x) = x^2$. 


At this stage we could go on to consider general cubic functions, $f(x) = ax^3 + bx^2 + cx + d$, 
or quartic functions $f(x) = ax^4 + bx^3 + cx^2 + dx + e$, but let’s go 
even further and write down the most general polynomial 

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0,$$

where $a_0, a_1, a_2, \ldots, a_{n-1}, a_n$ are real numbers. If $a_n \neq 0$ then this is called a 
polynomial of degree n, e.g. $f(x) = x^{24} - 17x^9 + 14.673$ is a polynomial of degree 
24. We can write the general polynomial more succinctly using sigma notation: 

$$f(x) = \sum_{r=0}^{n} a_r x^r$$

and it is natural to ask if we can generalise even further and consider functions 
that are defined by infinite series: 

$$f(x) = \sum_{r=0}^{\infty} a_r x^r.$$

We have to be careful here for we need the series on the right-hand side to 
converge for a range of values of x. Any function f which is legitimately defined 
by a properly convergent series is said to have a power series expansion. 3 In fact, 
the general theory of power series tells us that if $\sum_{r=0}^{\infty} a_r x^r$ converges then it does so 
for all x in an interval of convergence (−R, R). The positive real number R is called 
the radius of convergence. It then follows that the series diverges if |x| > R. At x = R 
and x = −R the general theory is inconclusive and the series may either converge or 
diverge, so a case-by-case consideration is needed. If the series converges for all 
values of x we say that R = ∞. An example of such a function that we’ve already 
met in Chapter 7 is the exponential function $f(x) = e^x$ (Figure 12.5). We recall 
that in this case: 

$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}.$$

3 The terminology 'power series’ comes from the fact that we are using a series that involves 
powers of x. 

Figure 12.5. $f(x) = e^x$. 

Two other well-known functions whose power series converge for all values of x are the 
trigonometric functions f(x) = sin(x) and f(x) = cos(x), as shown in Figure 12.6. 
We first meet these as expressing important ratios of the sides of a right-angled 
triangle in basic trigonometry. When they are revealed as functions they are basic 
models for periodic phenomena such as wave motion. In this case the power series 
expansions are 

$$\sin(x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!}, \qquad \cos(x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}.$$

Figure 12.6. f(x) = sin(x) and f(x) = cos(x). 

For an example of a function that has a radius of convergence R < ∞, consider 
$f(x) = \frac{1}{1-x}$. By the binomial series, 4 we have 

$$\frac{1}{1-x} = \sum_{n=0}^{\infty} x^n = 1 + x + x^2 + x^3 + \cdots$$

for −1 < x < 1. If x > 1, you can see for yourself that the series diverges. So 
R = 1 in this case. 
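
One way to get a feel for the radius of convergence is to compute partial sums and watch what they do. The Python sketch below is only an illustration. The helper names (`partial_sum`, `exp_coeff`, `geometric_coeff`) are hypothetical, and no finite computation can settle a question of convergence.

```python
import math
from typing import Callable


def partial_sum(coeff: Callable[[int], float], x: float, terms: int) -> float:
    """Sum of the first `terms` terms a_0 + a_1 x + a_2 x^2 + ... of a power series.

    `coeff(r)` returns the coefficient a_r.
    """
    return sum(coeff(r) * x**r for r in range(terms))


def exp_coeff(r: int) -> float:
    return 1 / math.factorial(r)  # the series for e^x has a_r = 1/r!


def geometric_coeff(r: int) -> float:
    return 1.0                    # the series for 1/(1 - x) has a_r = 1


# Inside the interval of convergence the partial sums settle down ...
print(partial_sum(exp_coeff, 1.0, 20), math.e)
print(partial_sum(geometric_coeff, 0.5, 60), 1 / (1 - 0.5))
# ... but at x = 2 the geometric partial sums just keep growing.
print([partial_sum(geometric_coeff, 2.0, n) for n in (5, 10, 20)])
```

The first two lines print pairs of numbers that agree to many decimal places, while the last list grows without any sign of settling.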


12.2 Limits and Continuity 


Analysis really takes off when the concepts of function and limit meet and interact. 
We begin by defining the limit of a function at a point x. Let $(x_n)$ be any sequence 
that converges to the point x. Now consider the sequence $(f(x_n))$. If $(f(x_n))$ 
converges to a real number l, for every sequence $(x_n)$ that converges to x, we 
say that the function has a limit l at the point x, and we write 

$$\lim_{y \to x} f(y) = l.$$

It turns out that this is equivalent to the following $\varepsilon$-$\delta$ formulation: we say that 
$\lim_{y \to x} f(y) = l$ if given any $\varepsilon > 0$, there exists $\delta > 0$ so that whenever $|y - x| < \delta$ 
we must have $|f(y) - l| < \varepsilon$. 
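
To see the $\varepsilon$-$\delta$ formulation in action, here is a small worked example. The function f(y) = 2y is chosen purely for illustration and is not taken from the discussion above. The point to notice is that $\delta$ is allowed to depend on $\varepsilon$.

```latex
\begin{align*}
&\text{Let } f(y) = 2y \text{ and } l = 2x. \text{ Given } \varepsilon > 0,
  \text{ choose } \delta = \tfrac{\varepsilon}{2}. \\
&\text{If } |y - x| < \delta \text{ then} \\
&\qquad |f(y) - l| = |2y - 2x| = 2\,|y - x| < 2\delta = \varepsilon ,
\end{align*}
```

so $\lim_{y \to x} 2y = 2x$ at every point x.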

One of the most important applications of the limit concept is to the concept of 
continuity. The idea is to describe (using precise mathematics) a function that is 
continuous, i.e. it has the property that its graph can be drawn using a pencil (say) 
in one continuous flow without ever taking the pencil from the paper. We say that 
a function f is continuous if it is continuous at every point x and it is continuous at 
x if $\lim_{y \to x} f(y) = f(x)$. So the function f is continuous at x if for every sequence 
$(x_n)$ that converges to x we must have that the sequence $(f(x_n))$ converges to f(x). 
All polynomials are continuous and so are the functions $f(x) = e^x$, f(x) = sin(x) 
and f(x) = cos(x). 

4 See Appendix 1 if you aren't familiar with this. 

Here’s an example of a function that fails to be continuous. In fact it doesn’t 
even have a limit at x = 0. It is called the Heaviside function, after the engineer 
Oliver Heaviside (1850-1920) who first made use of it in applications. 5 It is 
defined by: 

$$f(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1 & \text{if } x \ge 0. \end{cases}$$

The graph of f is shown in Figure 12.7. 

Figure 12.7. The Heaviside function. 

The Heaviside function f can represent a signal that is turned on to have 
a value of 1 unit at the point x = 0. From the graph it seems obvious that f 
is not continuous at x = 0. To see how to prove this rigorously, consider the 
sequence $(x_n)$ where each $x_n = -\frac{1}{n}$. Clearly $(x_n)$ converges to 0. Now for each n, 
$f(x_n) = f\left(-\frac{1}{n}\right) = 0$ and so we also have that $(f(x_n))$ converges to 0. But f(0) = 1 
and so we have found a sequence $(x_n)$ which converges to 0 but $(f(x_n))$ doesn’t 
converge to f(0). So f cannot be continuous at the point x = 0. 6 
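
The argument can be watched numerically. In the Python sketch below (an illustration only, with the name `heaviside` chosen for convenience) we evaluate the function along the sequence whose nth term is −1/n and compare with its value at 0.

```python
def heaviside(x: float) -> int:
    """The Heaviside function: 0 for x < 0 and 1 for x >= 0."""
    return 0 if x < 0 else 1


# The sequence x_n = -1/n creeps up to 0 from the left.
xs = [-1 / n for n in range(1, 10_001)]
values = {heaviside(x) for x in xs}

print(xs[-1])        # the last term computed, very close to 0
print(values)        # the set of values taken along the sequence
print(heaviside(0))  # the value at the limit point itself
```

Every term of the sequence is sent to 0, yet the function takes the value 1 at the point the sequence is heading for. A computer can only ever check finitely many terms, of course, so the rigorous argument is still the one given above.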


12.3 Differentiation 


In this section and the next, we’ll briefly describe how the conceptual framework of 
analysis enables us to give rigorous meaning to the two key ideas of the calculus - 
differentiation and integration. 

Differentiation is the process whereby we calculate the rate of change of a 
function at a point. The diagram in Figure 12.8 may give some insight into this. 


5 He also discovered the Heaviside layer in the atmosphere, see 
http://en.wikipedia.org/wiki/Oliver_Heaviside. Nowadays mathematicians often use the terminology indicator function 
to describe functions that take the value 1 on some interval and are zero for all other values of x. 

6 Of course f is continuous at every other point. Can you prove this using precise 
mathematics? 
We consider two points on the x-axis which are a very small distance apart. 
It’s traditional to use the symbol $\Delta x$ to represent a small distance. In the diagram 
$\Delta x > 0$ and so our two points that are very close together are x and $x + \Delta x$. 
If y = f(x), we’ll define $\Delta y = f(x + \Delta x) - f(x)$. If f is a continuous function, 
then $\lim_{\Delta x \to 0} \Delta y = 0$. The rate of change of the function between the two points 
x and $x + \Delta x$ is $\frac{\Delta y}{\Delta x}$. We want to measure the rate of change at the point x itself. 
Geometrically this is precisely the slope of the tangent line to the curve f at the 
point x. We might argue that this should be the value of $\frac{\Delta y}{\Delta x}$ at $\Delta x = 0$, but if f 
is continuous, this is $\frac{0}{0}$ which has no meaning. Isaac Newton (1643-1727) and 
Gottfried Leibniz (1646-1716) (working independently) 7 realised that 8 although 
$\frac{\Delta y}{\Delta x}$ has no meaning at $\Delta x = 0$, it is quite possible for $\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}$ to exist and 
indeed this is the case for many important functions. To be precise, we say that a 
function f is differentiable at the point x if this limit exists and in this case we define 
(writing h in place of $\Delta x$) 

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$
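
A quick numerical experiment shows the ratio settling down as h shrinks. The Python sketch below is an illustration only. The function $x^2$ and the point x = 3 are arbitrary choices, and the helper name `difference_quotient` is hypothetical.

```python
from typing import Callable


def difference_quotient(f: Callable[[float], float], x: float, h: float) -> float:
    """The ratio (f(x + h) - f(x)) / h whose limit as h -> 0 defines f'(x)."""
    return (f(x + h) - f(x)) / h


def square(x: float) -> float:
    return x * x


# Shrink h and watch the ratio approach the slope of the tangent at x = 3.
for h in (1.0, 0.1, 0.01, 0.001):
    print(h, difference_quotient(square, 3.0, h))
```

For this function the ratio works out algebraically to 6 + h, so the printed values close in on 6, the slope of the tangent at x = 3. Setting h = 0 directly would ask the computer to divide zero by zero, which fails for the same reason that the ratio has no meaning there.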


7 I'm not going to comment on the priority dispute that so obsessed many of their followers 
and remains the subject of scholastic enquiry to this day. 

8 Of course, Newton and Leibniz did not know the precise definition of a limit but they 
had sufficient insight to be able to make the methodological breakthrough that established the 
calculus as a working tool for mathematicians and scientists. 
If f'(x) exists at every x then f' is a new function called the ‘derived function’ 
or derivative of f. To symbolise the fact that f' is a limit of ratios, 
we often use the convenient notation $\frac{dy}{dx} = f'(x)$; however I would emphasise 
that $\frac{dy}{dx}$ is a holistic notation. There is no mythical ‘dy’ that is to be divided by 
an equally mythical ‘dx’. The table below gives some derivatives of well-known 
functions: 


f(x)        f'(x) 

c           0 
$x^n$       $nx^{n-1}$ 
log(x)      $\frac{1}{x}$ 
$e^x$       $e^x$ 
sin(x)      cos(x) 
cos(x)      −sin(x) 

As a fun exercise, take the fact that the derivative of $x^n$ is $nx^{n-1}$ as given and 
try to derive the derivatives of $e^x$, sin(x) and cos(x) by using the series expansions 
for these functions that were given in Section 12.1. If you think carefully 
about this you’ll see that it involves interchanging two limiting procedures and 
that really needs to be justified carefully. 
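
To get you started, here is how the formal calculation goes for the exponential function, using the standard expansion $e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$. This is only a sketch. The second line differentiates the infinite sum one term at a time, and that is precisely the interchange of limits which is assumed here without proof.

```latex
\begin{align*}
\frac{d}{dx}\, e^x
  &= \frac{d}{dx} \sum_{n=0}^{\infty} \frac{x^n}{n!} \\
  &= \sum_{n=1}^{\infty} \frac{n\,x^{n-1}}{n!}
     && \text{(term by term; the constant term } n = 0 \text{ contributes nothing)} \\
  &= \sum_{n=1}^{\infty} \frac{x^{n-1}}{(n-1)!} \\
  &= \sum_{m=0}^{\infty} \frac{x^m}{m!} = e^x
     && \text{(relabelling } m = n - 1\text{)}.
\end{align*}
```

The sine and cosine series can be treated in just the same way and are left for you to try.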

The importance of the derivative in science, engineering, economics, etc. arises 
from the fact that rates of change are so commonly encountered in applications. 
For example, if we take t to be time and y to be the displacement of a moving 
object from its starting point then $\frac{dy}{dt}$ is precisely the instantaneous velocity of 
the moving object. If we have an electric circuit and Q is the quantity of charge 
moving past at time t then $\frac{dQ}{dt}$ is the electric current. 

Returning to mathematics, it can be shown that if a function f is differentiable 
at a point x then it must be continuous there, so (reversing the logic) discontinuous 
functions cannot have derivatives. 9 Differentiation is an example of an operator 
which is a function that changes functions to other functions. The study of general 
operators is a major part of modern analysis that forms a subject of its own which is 
called functional analysis. This is vital for a wide range of applications - especially 
quantum theory, the study of matter at the molecular, atomic and sub-atomic 
levels. 

9 At least not at the points of discontinuity. 
12.4 Integration 


The roots of integration go back to the Greek geometer Eudoxus of Cnidus 
(c.408-c.355 BCE) (see Chapter 13) but we’ll again pick up the story in the 
seventeenth century. The problem is to calculate the area under a curve which 
is described by a function f. For convenience we’ll assume that $f(x) \ge 0$ for all 
x and that we want to calculate the area between the points x = a and x = b, as 
shown in Figure 12.9. 

Again the key breakthrough in methodology was made by Newton and Leibniz, 
but the solid foundations in analysis had to wait until the work of Riemann. 10 
To see how to calculate the area under the curve we’ll assume that the function f 
is bounded on the interval [a, b] so there exist two nonnegative numbers m and 
M such that $m \le f(x) \le M$ for all $a \le x \le b$. We need to partition the interval 
[a, b] into a set of points $\mathcal{P} = \{x_0, x_1, \ldots, x_n, x_{n+1}\}$, where we insist that $x_0 = a$, 
$x_{n+1} = b$ and $x_0 < x_1 < \cdots < x_n < x_{n+1}$. We will not assume that the points 
in the partition are equally spaced. We define a crude approximation to the area 
under the curve (Figure 12.10) by overshooting on each interval $[x_j, x_{j+1}]$. To do 
this we define $M_j = \sup_{x_j \le x \le x_{j+1}} f(x)$. As f is bounded above, $M_j$ is finite for all 
j = 0, 1, . . . , n. We add up the area of all the rectangles that overshoot and this is 
precisely $\sum_{j=0}^{n} M_j (x_{j+1} - x_j)$. 


10 The approach that we’ll give is in fact due to Jean-Gaston Darboux (1842-1917), see 
http://en.wikipedia.org/wiki/Jean_Gaston_Darboux 
Now imagine that we carry out this same calculation for all possible partitions 
of [a, b]. Clearly some overshoot approximations will be better than others in the 
sense that they get closer to the area that we want. We define the upper integral 
to be the best approximation that we can get in this way, i.e. the number which 
delicately undercuts all the overshoots. This is 

$$\overline{\int_a^b} f(x)\,dx = \inf_{\mathcal{P}} \sum_{j=0}^{n} M_j (x_{j+1} - x_j).$$

The symbol on the left is the notation for upper integrals. Indeed we should 
think of an integral as a continuous version of a sum and Leibniz had the idea 
of representing this by an elongated S that is stretched out at both the top and 
bottom. 11 

Having got the best possible approximation to the area from above, let’s do 
the same thing from below. Now we systematically undershoot and for a fixed 
partition $\mathcal{P}$ we define the undershoot area to be $\sum_{j=0}^{n} m_j (x_{j+1} - x_j)$ where 
$m_j = \inf_{x_j \le x \le x_{j+1}} f(x)$. 


11 The extension of the Leibniz notation to upper and lower integrals seems to have been 
introduced by Darboux. 
The lower integral will delicately overtrim all the undershoots and this is 
defined by 

$$\underline{\int_a^b} f(x)\,dx = \sup_{\mathcal{P}} \sum_{j=0}^{n} m_j (x_{j+1} - x_j).$$

Now it’s a deep insight due to Darboux that the area under a curve makes 
sense (remember f is bounded but may not be continuous) only when 
delicately undercutting all the overshoots (see Figure 12.11) and just as delicately 
overtrimming all the undershoots gives precisely the same answer. So we say that 
a function f is integrable on [a, b] if $\underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx$. In this case we write 
$\int_a^b f(x)\,dx$ for the common value and call it the integral of the function f over 
the interval [a, b]. It is a fact that if f is continuous on such an interval then it is 
integrable. 
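
The squeeze between overshoots and undershoots can be watched numerically. The Python sketch below makes two simplifying assumptions that are not part of the general theory: the partition points are equally spaced, and the function is increasing on [a, b], so that each supremum is attained at the right-hand end of its subinterval and each infimum at the left-hand end. The name `darboux_sums` and the choice of $x^2$ on [0, 1] are for illustration only.

```python
from typing import Callable, Tuple


def darboux_sums(f: Callable[[float], float], a: float, b: float, n: int) -> Tuple[float, float]:
    """Overshoot and undershoot sums for an increasing function f on [a, b].

    The partition is x_0 < x_1 < ... < x_{n+1} with equal spacing, so there
    are n + 1 subintervals. Because f is assumed to be increasing, M_j is
    f(x_{j+1}) and m_j is f(x_j).
    """
    points = [a + (b - a) * j / (n + 1) for j in range(n + 2)]
    upper = sum(f(points[j + 1]) * (points[j + 1] - points[j]) for j in range(n + 1))
    lower = sum(f(points[j]) * (points[j + 1] - points[j]) for j in range(n + 1))
    return upper, lower


def square(x: float) -> float:
    return x * x


for n in (9, 99, 999):
    upper, lower = darboux_sums(square, 0.0, 1.0, n)
    print(n + 1, lower, upper)
```

As the number of subintervals grows, the undershoot and overshoot trap the value 1/3 between them ever more tightly, which is what integrability of this function on [0, 1] amounts to.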

We can see how the area grows by changing the fixed point b to a variable y. It 
turns out that there is always a continuous function F such that 

$$\int_a^y f(x)\,dx = F(y) - F(a). \qquad (12.4.1)$$

The function F is called an indefinite integral of f. 12 

Note from (12.4.1) that the function F is not unique - if F is an indefinite integral 
then so is F + k where k is any fixed real number. The number k is usually 


12 It is also sometimes called a primitive or antiderivative. 



called an arbitrary constant. In applications, it generally has a specific value that 
is determined by the way the world is. Some values of indefinite integrals for 
standard functions are shown below (without k). 


f(x)                    F(x) 

c                       cx 
$x^n$ ($n \neq -1$)     $\frac{x^{n+1}}{n+1}$ 
$\frac{1}{x}$           log(x) 
$e^x$                   $e^x$ 
sin(x)                  −cos(x) 
cos(x)                  sin(x) 

In basic calculus courses, the indefinite integral is often written with the 
same symbol as the definite one but without the upper and lower limits b 
and a: 

$$\int f(x)\,dx = F(x) + k.$$


It is common to teach the method of going from f to F first as the reverse 
operation to differentiation and then to make the connection afterwards with 
the area under the curve y = f(x) between x = a and x = b by arguing 
that $\int_a^b f(x)\,dx = F(b) - F(a)$. The work of Riemann, Darboux and their 
followers shows that from a foundational viewpoint, the definite integral is more 
fundamental and the indefinite integral derives its meaning from that more basic 
concept. 

If you compare the two tables of derivatives and integrals you can indeed check 
that these are mutually inverse to each other, e.g. if we start with $x^n$ ($n \neq -1$) and 
integrate we get $\frac{x^{n+1}}{n+1}$. Now differentiate this and we come back to $x^n$. This extends 
to an important general result. I pointed out earlier that if f is integrable, then F 
is continuous. Moreover if f is continuous it turns out that F is differentiable and 

$$\frac{dF(x)}{dx} = f(x).$$

Conversely the so-called fundamental theorem of calculus tells us that if f is 
differentiable (hence continuous, and so integrable) and its derivative f' is itself 
integrable, then 

$$\int_a^b f'(x)\,dx = f(b) - f(a).$$
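
The theorem is easy to test on a particular case. The Python sketch below takes f(x) = sin(x), whose derivative is cos(x), and compares a numerical estimate of the area under the cosine curve with sin(b) − sin(a). The helper `midpoint_integral` is a hypothetical name for a simple midpoint sum, used here in place of the upper and lower sums purely for convenience, and the interval [0, 2] is an arbitrary choice.

```python
import math
from typing import Callable


def midpoint_integral(f: Callable[[float], float], a: float, b: float, steps: int = 10_000) -> float:
    """Approximate the integral of f over [a, b] by a midpoint sum."""
    width = (b - a) / steps
    return sum(f(a + (k + 0.5) * width) for k in range(steps)) * width


a, b = 0.0, 2.0
area = midpoint_integral(math.cos, a, b)  # estimate of the integral of f'(x) = cos(x)
change = math.sin(b) - math.sin(a)        # f(b) - f(a) with f(x) = sin(x)
print(area, change)
```

The two printed numbers agree to many decimal places. This is evidence and not a proof, but it is reassuring to see the theorem at work.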
This is a good place to stop - but in truth, we have only just started. We haven’t 
even mentioned some key highlights of calculus such as the mean value theorem 
and Taylor’s theorem. Beyond those, we should extend to two or more dimensions and look 
at partial derivatives and multiple integrals and also study calculus and analysis 
in the complex plane. All of these topics (and more) are vitally important for both 
pure and applied mathematics. The libraries are stacked with books, the internet 
is awash with references and the gentle reader is invited to go forth and sample 
the delights that await them. 
Some Brief Remarks About 
the History of Analysis 


The history of mathematics is itself a vast subject. To isolate just one topic - 
analysis - and try to outline its history is a somewhat dangerous enterprise 
and the reader should be aware that the author is not a professional historian and 
is a lot less sure of his footing in this territory. Analysis did not really emerge 
as an area of mathematics with its own identity until the nineteenth century. 
Although we can trace the development of some important ideas from antiquity it 
is difficult to disentangle these from geometry and algebra (and later - calculus). 
Nevertheless, if the reader holds these caveats firmly in his or her mind, we’ll try 
to sketch our outline. 

The main mathematical story in this book concluded with integration. Maybe 
the history of analysis should start there for what is integration but the art of 
calculating areas (and volumes) of complicated shapes by using limits of areas 
of simpler shapes? The first to have used this idea systematically seems to be 
the Greek geometer Eudoxus of Cnidus (c.408-c.355 BCE). He saw, for example, 
that you could get a good approximation to the area of a circle by filling it up 
with inscribed polygons with increasing numbers of sides. Books XI, XII and XIII 
of Euclid’s celebrated Elements collect results of this type and the method was 
taken to even greater refinement by Archimedes of Syracuse (287-212 BCE). But 
I should emphasise that there was no formal limit concept in their work as has 
been developed in this book. The mathematical perspective in those days was 
geometric. 

From Greek mathematics, we now fast forward to the rise of the modern world 
in the Western hemisphere. The fifteenth and sixteenth centuries saw a renewed 
interest in Greek geometry and significant progress in algebra. René Descartes 
(1596-1650) began the process of unifying these through his introduction of 
co-ordinates. The work of Galileo Galilei (1564-1642) in mechanics started 
to stimulate mathematicians to worry about how to calculate instantaneous 
velocities. The origins of calculus can be found in the work of Pierre de Fermat 
(1601-65), Descartes and Isaac Barrow (1630-77) who was Newton’s predecessor 
at Cambridge, but their approach to the calculation of tangents to a curve was still 
dominated by geometric arguments. Important contributions were also made by 
John Wallis (1616-1703). 
The development of a universal methodology that forged the calculus into 
a powerful working tool was due to Isaac Newton (1642-1727) and Gottfried 
Leibniz (1646-1716). The function concept was not yet established. Newton wrote 
of a changing quantity which he called a ‘fluent’ and its derivative was a ‘fluxion’ 
(in his work differentiation was always with respect to time). To differentiate the 
fluent f, Newton allows f(x) to flow into f(x + o) and he writes ‘Let now the 
increment vanish . . . ’ when he extracts the fluxion from the ratio $\frac{f(x+o) - f(x)}{o}$. This 
became known as the ‘method of infinitesimals’ and here lies the origin of the 
modern definition of the derivative as a limit. Leibniz introduced the notation $\frac{dy}{dx}$ 
for derivatives and $\int$ for integrals and was the first to see that differentiation and 
integration are mutually inverse. 

Calculus was intensively developed by the followers of Newton and Leibniz 
but the mathematics was not rigorous and attracted some criticism from e.g. the 
philosopher and cleric George Berkeley (1685-1753) who famously argued that 
infinitesimals had the status of ‘the ghost of departed quantities’. Towards the 
end of the eighteenth century it was clear that there was a notion of ‘limit’ that 
lay behind the unsatisfactory notion of infinitesimals, but it wasn’t yet clear 
how to define it! In his article ‘Limite’ 1 Jean d’Alembert (1717-83) wrote ‘The 
theory of limits is the true metaphysics of the calculus ...it is never a question 
of infinitesimal quantities in the differential calculus, it is uniquely a question of 
limits of finite quantities’. 

Leonhard Euler (1707-83) was called ‘analysis incarnate’ by his contemporaries. 
He published an important two-volume treatise in 1748 called Introduction 
to Infinitesimal Analysis. This book pioneered the concept of a function as 
a relationship between two variables and also contained many results about 
infinite series and products and continued fractions. Euler’s work exhibited a 
wonderful genius for calculating explicit values of series and integrals (some of 
which have been encountered in this book) and yet he had no precise concept 
of the notion of limit as we understand it. For example the delightful formula 
$e = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n$ which we met in Chapter 7 was discovered by Euler, but he 
wrote it as ‘$e = \left(1 + \frac{1}{i}\right)^i$’ where i stood for ‘infinity’ (and not the square root of 
minus one in this instance). 

The breakthrough to understanding the real nature of the limit came in the 
nineteenth century and we should honour the contribution of a great trio of 
mathematicians. The Bohemian mathematician Bernhard Bolzano (1781-1848) 
realised that a function f is continuous at x if f(x + h) − f(x) can be made as 
small as you like if h is sufficiently small. Similar ideas were also independently 
developed by the French mathematician Augustin-Louis Cauchy (1789-1857). 
Cauchy is a very important figure in the history of analysis. We have already 
met his root and condensation tests for convergence of series in Exercises 6.13 


1 In J. le R. d'Alembert and D. Diderot (eds). Encyclopédie ou dictionnaire raisonné des 
sciences, des arts et des métiers. Paris-Neuchatel-Amsterdam (1765). 
and 6.14 (respectively) and the notion of a Cauchy sequence in Chapter 11. 
He also pioneered the important field of complex analysis whereby analytic 
techniques are extended to sequences and series comprising complex numbers 
and to complex-valued functions. Indeed this was an essential ingredient for 
Riemann’s work on the analytic continuation of the zeta function that we alluded 
to in Section 8.3. It was the German mathematician Karl Weierstrass (1815-97) 
who introduced the modern $\varepsilon$-$\delta$ definition of a limit and was finally able to 
give rigorous meaning to the notion of sufficiently small. 2 Much of Weierstrass’ 
work was carried out while he was a high school teacher in 1841-56 and didn’t 
become well known until he lectured at Berlin (where he obtained a professorial 
chair) in 1859. Earlier in 1854 Bernhard Riemann (1826-66) (building on work of 
Cauchy) had established that the integral was essentially a limit of what are now 
called ‘Riemann sums’. 

Meanwhile a new front was opened up in analysis when Joseph Fourier 
(1768-1830) began to study trigonometric series - now called Fourier series in his 
honour. This opened up a whole range of new problems to solve and led Georg 
Cantor (1845-1918) to his research on set theory and the infinite. 

I haven’t said much about the history of sequences and series in this brief 
chapter. It seems that the first systematic user of sequences in analysis was Cauchy 
in his Cours d’Analyse de l’École Polytechnique which was first published in 1821. 
The first treatise on infinite series appeared a century earlier as an appendix to 
Jacob Bernoulli’s Ars Conjectandi which was published in 1713 - eight years after 
the author’s death. The main part of the book is about probability theory but as 
Bernoulli was one of the first to appreciate, this leads naturally to questions about 
limits and series (and that’s another big story). 

So infinite series were being manipulated and their ‘sums’ calculated long 
before the rigorous notion of a limit was established. Indeed in Chapter 4 we 
met the proof that the harmonic series was divergent due to Nicolas Oresme 
(1323?-82). By the time of the birth of calculus in the seventeenth century, infinite 
series were being used routinely by mathematicians and important advances were 
made by James Gregory (1638-75) and Nicolas Mercator (1620-87). 

With Weierstrass’ definition of the limit established, all the results obtained 
by earlier mathematicians on calculus and infinite series, products and continued 
fractions could now be made logically watertight. These subjects had now reached 
the form in which they are still typically taught in undergraduate mathematics 
courses. But the work of analysis was far from finished, indeed it could be argued 
that it had only just begun. As the twentieth century dawned, mathematicians 
such as Émile Borel (1871-1956) and Henri Lebesgue (1875-1941) explored an 
abstract concept called measure, which enabled a unified treatment of length, 

2 Note that Cauchy used $\varepsilon$ and $\delta$ in some of his proofs, although not within his definitions. In 
fact he was responsible for introducing $\varepsilon$ into analysis as standing for 'error' (erreur in French) - 
see Judith Grabiner, 'Who gave you the epsilon? Cauchy and the origins of rigorous calculus'. 
American Mathematical Monthly 90, No. 3, 185-94 (1983). 
area, volume and higher-dimensional analogues as well as probability. Out of this 
work, Lebesgue developed a new theory of integration that was more flexible 
and powerful than Riemann’s. This became a key tool for one of the great 
developments of twentieth century mathematics - functional analysis - which 
saw analysis and linear algebra unified in a new powerful theory which allowed 
analytic techniques to be applied within infinite-dimensional spaces. This turned 
out to be vital for the new physics of quantum theory. Functional analysis is just 
one of many distinct areas within modern analysis that continues to be developed 
in the twenty-first century. Analysis is also finding new important application 
areas within mathematics itself such as fractals, dynamical systems and stochastic 
processes. Indeed analysis is well established as one of the fundamental strands of 
mathematics that reaches far into both pure and applied mathematics and there 
is much to be done in the future. 
In this section, I will try to address the question, “What should I read next?” What 
follows is very much a personal selection and a mixture of classics with the perhaps 
less well known. I hope no-one is offended by the even longer list of high quality books 
that I’ve omitted here. Textbooks are all marked (T). 

1. General, with emphasis on numbers 

(T) R.B.J.T. Allenby, Numbers and Proofs, Butterworth-Heinemann (1997). 

For those who need a solid grounding in how proof works in mathematics this gives a 
very thorough and readable textbook account. There is also an introduction to set theory 
and proof by induction and nice material on prime numbers, integers, rationals, reals 
and complex numbers. 

T. Dantzig, Number, The Language of Science, George Allen and Unwin Ltd (1st 
edition 1936, 4th edition 1962). 

This classic account of the history of the number concept from the time of ancient 
Sumerians and Egyptians to Cantor’s theory of the infinite should be accessible to any 
reader of this book. 

J.H. Conway, R.K. Guy, The Book of Numbers, Springer Science+Business Media 
(1996). 

If you are fascinated by the patterns that are created by interesting numbers then this is 
the book for you. You can have fun playing with numbers and shapes, e.g. by thinking 
about rhombic dodecahedral numbers and also learn about some quite sophisticated 
topics in number theory such as Ramanujan numbers. The whole book is pervaded by a 
glorious sense of fun. 

M. Gazalé, Number: From Ahmes to Cantor, Princeton University Press (2000). 

This is another history of the development of numbers and has some similarities with 
Dantzig’s book, but this one contains more mathematics, some of which is quite detailed 
and demanding. For example there is material on the arithmetic of numbers written 
in different bases, on continued fractions and an approach to constructing irrational 
numbers called ‘cleavages’. 

2. General, broader scope 

E. Kasner, J. Newman, Mathematics and the Imagination, G. Bell and Sons Ltd (1949). 

A classic introduction to mathematics for the general reader, but this is also well worth 
reading for students (and professional mathematicians). It contains a nice account of 
Cantor’s theory of the infinite. 
(T) R. Courant, H. Robbins (revised by Ian Stewart), What is Mathematics?, Oxford 
University Press (first edition 1941, second revised edition 1996). 

A wonderful introduction to the subject which might be perfect reading on those long 
hot summer days between school and university. The book covers a lot of ground: 
numbers, algebra, projective geometry, elementary topology and calculus and includes 
an introduction to limits and continuity. 

L. Gårding, Encounter with Mathematics, Springer-Verlag New York Inc (1977). 

A survey of mathematics (some quite advanced) for budding mathematicians. This book 
covers number theory, algebra, geometry, linear algebra, analysis, topology, calculus, 
probability and applications. If you read this book before first year of university just after 
Courant and Robbins then you might even be able to teach some of your lecturers a thing 
or two. 

M. Aigner, G.M. Ziegler, Proofs From the Book (second edition), Springer-Verlag 
Berlin, Heidelberg, New York (2001). 

Paul Erdős (1913-96) was one of the leading problem-solving mathematicians of the 
twentieth century and was certainly a unique and remarkable character. To learn more 
about his life and work you might read his biography, The Man Who Loved Only 
Numbers by Paul Hoffman, Fourth Estate, London (1998). Erdős loved beautiful proofs 
and whenever he found one he said it should go into ‘The Book’ (which is perhaps kept 
by God, whom he also referred to as the ‘Supreme Fascist’). This lengthy preamble is simply 
designed to explain the title of this collection of beautiful mathematical proofs in the five 
selected areas of number theory, geometry, analysis, combinatorics and graph theory. 
Although they are in some sense ‘standard’, the proofs that I gave for the irrationality of 
e and π are strongly influenced by those that appear here. 

3. Textbooks on Analysis - first course level 

These books are listed in increasing order of difficulty, at least as far as I can judge. 

(T) R.P. Burn, Numbers and Functions, Steps into Analysis, Cambridge University 
Press (1992). 

This covers a standard course in analysis dealing with the ‘usual material’ of sequences, 
series, functions, continuity, differentiation and integration. A unique feature is that most 
of the book consists of guided exercises for the student so you learn by doing rather than 
just reading. 

(T) K.G. Binmore, Mathematical Analysis, A Straightforward Approach (second 
edition), Cambridge University Press (1982). 

This is one of the very best textbooks covering a basic first year undergraduate course in 
analysis. It is well written, nicely explained and has a good collection of exercises. As well 
as the ‘usual material’ there are also very welcome chapters on the gamma function and 
differentiation in higher dimensions. 

(T) D. Bressoud, A Radical Approach to Real Analysis (second edition), The 
Mathematical Association of America (2007). 

In his introduction the author argues that the ‘usual approach’ is ‘the right way to view 
analysis but the wrong way to teach it’. He prefers a route that is highly informed by the 
history of the subject so that students can appreciate how the great masters themselves, 
such as Euler and Cauchy, struggled with the problems of defining functions and limits. 
This is a very interesting book and it might be helpful to read this in parallel with taking 
a traditional course. 

(T) T. Tao, Real Analysis I (second edition), Hindustan Book Agency (India) (2009). 

There is no Nobel prize for mathematics.¹ Instead we have the Fields medals which 
are awarded every four years to mathematicians under the age of 40 who have made 
outstanding research contributions to the subject. Terence Tao (b.1975) was awarded a 
Fields medal in 2006. He has also written a highly innovative two-volume textbook on real 
analysis. His viewpoint is that a thorough treatment of foundations is required before 
we encounter the ‘usual material’ and so he gives an extensive set theoretic treatment 
of the natural numbers, integers, rational numbers and real numbers (including the 
construction of the latter using Cauchy sequences) in the first 120 pages. Volume 2 
deals with more advanced topics including Fourier series, metric spaces and Lebesgue 
integration. 

(T) D.B. Scott, S.R. Tims, Mathematical Analysis, An Introduction, Cambridge 
University Press (1966). 

I include this book (which is sadly out of print) as I first learned analysis from it and it 
has a special place in my heart. I read it when I was in the sixth form as my maths teacher 
advised that anyone intending to continue with mathematics at university should study 
it. How right he was! 

(T) S.G. Krantz, Real Analysis and Foundations (second edition), Chapman and 
Hall/CRC (2005). 

This is a very wide-ranging book. It starts off with basic material on logic, numbers, 
sequences and series. Before we complete the ‘usual material’ of limits, continuity and 
calculus there is a chapter on topology. The book also contains material on differential 
equations, Fourier series, multivariate calculus and it ends by giving ‘a glimpse of wavelet 
theory’. 

(T) E. Hairer, G. Wanner, Analysis By Its History, Undergraduate Texts in 
Mathematics, Springer (1996). 

This takes a similar viewpoint to Bressoud’s book in teaching analysis through its history, 
but it is aimed at a more sophisticated readership. 

(T) W. Rudin, Principles of Mathematical Analysis (third edition), McGraw-Hill Inc 
(1976). 

This wonderful book is, in my opinion, the all-time winner of the ‘best text on elementary 
analysis’ competition. But it is not for the squeamish - after introducing number systems 
it goes straight into point set topology! I advise you not to read it directly after the current 
book. First go to one of the texts I listed earlier for the first course and then read this 
afterwards as the main meal. 


¹ There is no evidence whatsoever to support the oft-told story that this is because a 
mathematician had an affair with the mistress of the prize’s founder, Alfred Nobel (1833-96). 


4. Textbooks on Number Theory - first course level 

(T) J. Stopple, A Primer of Analytic Number Theory, Cambridge University Press 
(2003). 

Number theory is a vast subject and analytic number theory, i.e. that part of it which 
relies on analytic methods, is often seen as one of the most challenging and difficult. This 
beautiful book makes the subject about as accessible as it can be made. The early chapters 
should even be readable for first year undergraduates. 

(T) H. Davenport, The Higher Arithmetic (seventh edition), Cambridge University 
Press (1999). 

A classic book that is aimed at general readers although it certainly requires some 
mathematical sophistication. If you want to learn more about continued fractions, then 
I recommend looking here as your next step. 

(T) G.H. Hardy, E.M. Wright, An Introduction to the Theory of Numbers (fifth 
edition), Oxford University Press (1979). 

G.H. Hardy (1877-1947) was one of the greatest UK mathematicians of the twentieth 
century and he was an analyst par excellence. This is quite an advanced book and yet 
there are parts of it (such as the first two chapters on prime numbers and the fourth on 
irrationals) which you should be able to dip into and learn from. 

5. Books About Special Numbers 

P. Beckmann, A History of π, The Golem Press (1971). 

This book gives a short history, from the birth of the human race up to the dawn of the 
computer age, of those parts of mathematics which are associated with the calculation 
of π or the understanding of its importance. So for example, Chapter 14 gives a five 
page biography of Euler followed by five pages of his mathematics based around his 
contributions to π. 

E. Maor, e: The Story of a Number, Princeton University Press (1995). 

The book begins in 1614 with the discovery of logarithms by John Napier (1550-1617) 
and traces the history of calculus and analysis to the nineteenth century with a particular 
emphasis on e, $e^x$ and $e^{ix}$. Among the gems included here you will find an account of an 
imaginary meeting between Johann Bernoulli and J.S. Bach. 

J. Havill, Gamma, Exploring Euler’s Constant, Princeton University Press (2003). 

This book kept me thoroughly absorbed on what looked like being a very boring 
transatlantic flight (with a screaming 3-year-old sat next to me). If you think that γ 
is less interesting than e or π, then this will convince you otherwise. 

E. Maor, To Infinity and Beyond, Princeton University Press (1991). 

This book is subtitled ‘a cultural history of infinity’. From a mathematical point of view, 
not only does it cover the analytic approach to infinity which has been featured in the 
book that you are reading, but also the way infinity appears in geometry (e.g. the ‘point at 
infinity’ in projective geometry). You can also read about the role of infinity in aesthetics 
(e.g. in the paintings of Maurits Escher) and the cosmological significance of infinity within 
modern theories of the universe. A joy to read! 

B. Clegg, A Short History of Infinity, Constable and Robinson (2003). 

This book takes a broad historical view of how our ideas of the infinite have evolved, 
starting with Zeno’s paradoxes of motion and the philosophical views of Aristotle and 
St Thomas Aquinas and continuing with the discovery of calculus and the mathematical 
work of Cantor and Gödel. Very readable and informative. 

6. History 

M. Kline, Mathematics in Western Culture, Penguin (1953). 

This is a marvellous book written by one of the twentieth century’s greatest popularisers 
of mathematics. Here you can cross the art/science divide to learn how fifteenth-century 
painters created the mathematics of projective geometry so that they could incorporate 
perspective into their paintings. 

M. Kline, Mathematical Thought From Ancient to Modern Times, Oxford University 
Press (1972), reprinted in three volumes (1990). 

Possibly the most comprehensive history of the subject written so far; it begins with the 
Babylonians about 3000 BCE and ends with developments in set theory and mathematical 
logic in the early part of the twentieth century. 

C.B. Boyer, U.C. Merzbach, A History of Mathematics (third edition), J. Wiley and 
Sons Ltd (2011). 

This is the best known of the histories. It covers similar ground to Kline’s three volumes 
but takes us a little further into the twentieth century. The first edition was solely due to 
Boyer. 

I. Grattan-Guinness, The Rainbow of Mathematics, W.W. Norton and Co (1997). 

Another history of mathematics from ancient times to the early twentieth century but 
here there is a greater emphasis on applications to the physical world. 

I. James, Remarkable Mathematicians, Cambridge University Press (2002). 

Contains biographies of 60 mathematicians (including three women) who are allotted 
about five to seven pages each, starting with Euler in the eighteenth century and finishing 
with von Neumann in the twentieth century. 

J. Fauvel, J. Gray (eds.), The History of Mathematics: A Reader, Macmillan Press Ltd 
(1987). 

The great mathematicians in their own words. For example, here you will find some of 
Newton’s writings on the calculus, Cauchy’s on convergence of sequences and Cantor’s 
on defining the real numbers. 

W. Dunham, Euler: The Master of Us All, The Mathematical Association of America 
(1999). 

This is not a biography of Euler. It is an introduction to his wonderful mathematical 
achievements and comprises a chapter each describing his contributions to number 
theory, logarithms, infinite series, analytic number theory, complex variables, algebra, 
geometry and combinatorics. 

W. Dunham, The Calculus Gallery: Masterpieces from Newton to Lebesgue, Princeton 
University Press (2005). 

Like the same author’s work on Euler, mentioned above, this is really a book about 
mathematics rather than about its history. He selects eleven mathematicians (including 
Newton, Cauchy, Riemann, Weierstrass and Lebesgue) and one mathematical family 
(the Bernoullis) and gives a ten to fifteen page account of their contributions to calculus 
and/or analysis. 

The MacTutor History of Mathematics Archive (http://www.gap-system.org/history/) 
is a superb online resource where you can find short biographies of almost any 
mathematician you can think of (and many more besides). 

7. Easier Reading 

D. Berlinski, A Tour of the Calculus, W. Heinemann Ltd. (1996). 

Do read this book. It is on the one hand quite serious mathematics, but on the other hand 
it is also an awful lot of fun. Need I say more? 

There is no need to be restricted to reading books. There are also a number of journals 
aimed at teachers and students of mathematics which often contain interesting articles 
about analysis, calculus and related matters and sometimes these don’t require greater 
knowledge or sophistication than you needed to read this book. Those you might 
look at include The American Mathematical Monthly, Mathematical Spectrum, The 
Mathematical Gazette and Mathematics Magazine. 


Appendix 1: The Binomial Theorem 


We’ve used the binomial theorem on several occasions in this book. In this 
appendix we’ll discuss it and sketch a proof. Let x and y be real numbers. 
We want an expression for $(x + y)^n$ where n is an arbitrary natural number. 

Now we can do some basic algebra to calculate 

$$(x + y)^2 = (x + y)(x + y) = x^2 + 2xy + y^2,$$

$$(x + y)^3 = (x + y)(x + y)^2 = x^3 + 3x^2 y + 3xy^2 + y^3,$$

$$(x + y)^4 = (x + y)(x + y)^3 = x^4 + 4x^3 y + 6x^2 y^2 + 4xy^3 + y^4,$$

but what about a general formula? We need to expand 

$$(x + y)^n = \underbrace{(x + y)(x + y) \cdots (x + y)}_{n \text{ times}}.$$

From the form of the brackets, it is clear that¹ 

$$(x + y)^n = c_{0,n} x^n + c_{1,n} x^{n-1} y + c_{2,n} x^{n-2} y^2 + \cdots + c_{n-1,n} x y^{n-1} + c_{n,n} y^n,$$



where $c_{0,n}, c_{1,n}, \ldots, c_{n,n}$ are natural numbers. So the work required is to identify 
these numbers. We’ll argue in a combinatorial manner. The generic term involving 
$x^r y^{n-r}$ requires us to extract r of the xs and $(n - r)$ of the ys from the n brackets 
we are multiplying. This is equivalent to the problem of having n containers each 
containing a ball and having to choose r balls from these without replacing balls 
that have been removed. In how many ways can this be done? Well there are 
n ways of choosing the first, $n - 1$ ways of choosing the second, $n - 2$ ways of 
choosing the third and we keep going until we get to the last. There are $n - (r - 1)$ 
ways of choosing this one. So altogether the number of ways of choosing the 
balls is 

$$n(n - 1)(n - 2) \cdots (n - r + 1) = \frac{n!}{(n - r)!}.$$

¹ This is more along the lines of a convincing argument than a fully rigorous proof; for that 
we need the technique of mathematical induction. 

But these balls have been chosen in a particular order and that should be irrelevant. 
The total number of ways of ordering the r balls is r! and so we conclude that 

$$c_{r,n} = \frac{n!}{(n - r)!\, r!}.$$

We usually use the notation $\binom{n}{r}$ instead of $c_{r,n}$ and these numbers are called 
binomial coefficients. Since $\binom{n}{0} = \binom{n}{n} = 1$ we can succinctly write: 

Theorem A.1 (The Binomial Theorem). If x and y are arbitrary real numbers and 
n is a natural number: 

$$(x + y)^n = \sum_{r=0}^{n} \binom{n}{r} x^r y^{n-r}.$$

You should check that $\binom{n}{1} = \binom{n}{n-1} = n$ and more generally $\binom{n}{r} = \binom{n}{n-r}$. The latter 
allows us to treat x and y symmetrically in the binomial theorem so that we also 
have 

$$(x + y)^n = \sum_{r=0}^{n} \binom{n}{r} x^{n-r} y^r.$$

As an exercise you should compute $\binom{6}{2} = 15$ and $\binom{6}{3} = 20$ and deduce that 

$$(x + y)^6 = x^6 + 6x^5 y + 15x^4 y^2 + 20x^3 y^3 + 15x^2 y^4 + 6xy^5 + y^6,$$


without having to multiply out any brackets. 
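
If you would like to see the counting argument at work, here is a minimal Python sketch. It is not part of the original text and the function names are my own invention. It computes each coefficient exactly as in the argument above (count the ordered selections, then divide by the r! orderings) and then spot-checks the theorem at one arbitrarily chosen pair of values for x and y.

```python
from math import factorial

def binomial_coefficient(n, r):
    """Count ordered selections n!/(n-r)! and divide by the r! orderings."""
    ordered = factorial(n) // factorial(n - r)
    return ordered // factorial(r)

def expand(n):
    """Coefficients of (x + y)^n, listed for r = 0, 1, ..., n."""
    return [binomial_coefficient(n, r) for r in range(n + 1)]

coefficients = expand(6)
print(coefficients)

# Numerical spot check of the theorem at x = 2, y = 3 (an arbitrary choice).
x, y, n = 2, 3, 6
total = sum(c * x**r * y**(n - r) for r, c in enumerate(coefficients))
print(total == (x + y) ** n)
```

A check at a single pair of values is only a sanity test, of course; it is the combinatorial argument that explains why the formula holds in general.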

The binomial theorem was apparently first discovered in 1664 or 1665 by 
the great Isaac Newton and communicated by him in two letters sent in 1676 to 
Henry Oldenburg who was secretary of the Royal Society. The pattern formed by the 
binomial coefficients and displayed below (where the right-hand side indicates the 
binomial expansion that these come from) is called Pascal’s triangle in honour of 
the mathematician and philosopher Blaise Pascal (1623-62). He was apparently 
the first in the Western world to notice that each coefficient appearing in the 
triangle is the sum of the two numbers immediately above it which are placed to 
the right and to the left. 

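$$\begin{array}{cl}
1 & (x+y)^0 \\
1 \quad 1 & (x+y)^1 \\
1 \quad 2 \quad 1 & (x+y)^2 \\
1 \quad 3 \quad 3 \quad 1 & (x+y)^3 \\
1 \quad 4 \quad 6 \quad 4 \quad 1 & (x+y)^4 \\
1 \quad 5 \quad 10 \quad 10 \quad 5 \quad 1 & (x+y)^5 \\
\text{etc.} &
\end{array}$$

You are invited to try to prove the formula that expresses this: 

$$\binom{n+1}{r} = \binom{n}{r-1} + \binom{n}{r} \quad \text{for } 1 \le r \le n.$$

It only requires basic algebra. 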
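
The rule that each entry is the sum of the two entries immediately above it is also a recipe for building the triangle. The following Python sketch (my own illustration, with a hypothetical function name) does exactly that, using no factorials at all.

```python
def pascal_rows(count):
    """Return the first `count` rows (count >= 1), each built from the row above."""
    rows = [[1]]
    for _ in range(count - 1):
        above = rows[-1]
        # Each interior entry is the sum of its two neighbours in the row above.
        interior = [above[i - 1] + above[i] for i in range(1, len(above))]
        rows.append([1] + interior + [1])
    return rows

for row in pascal_rows(6):
    print(row)
```

The six rows printed correspond to the expansions of $(x+y)^0$ up to $(x+y)^5$ displayed above.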

If we take y = 1 in the binomial theorem and cancel the coefficients down as 
far as is possible we get 

$$(1 + x)^n = \sum_{r=0}^{n} \binom{n}{r} x^r = 1 + nx + \frac{1}{2} n(n - 1)x^2 + \frac{1}{6} n(n - 1)(n - 2)x^3 + \cdots + x^n.$$

Newton was the first to consider what happens when n is allowed to take more 
general values. In fact if c is an arbitrary real number and $-1 < x < 1$, then $(1 + x)^c$ 
has meaning as the sum of a convergent infinite series - called the binomial series, 
and we have 

$$(1 + x)^c = 1 + cx + \frac{1}{2} c(c - 1)x^2 + \frac{1}{6} c(c - 1)(c - 2)x^3 + \cdots$$

In the special case c = −1 we get 

$$(1 + x)^{-1} = 1 - x + x^2 - x^3 + \cdots$$

and this was used in Section 7.2 to obtain Gregory’s series from which we deduced 
a series expansion for π. 
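
To get a feel for the binomial series you can add up its first few terms numerically. The sketch below is an illustration of mine rather than anything from the main text: the function name, the choice x = 0.5 and the cut-off of 60 terms are all assumptions. It builds each term from the previous one by multiplying by $(c - r)x/(r + 1)$, which is just the pattern of coefficients in the series above.

```python
def binomial_series(c, x, terms):
    """Sum the first `terms` terms of 1 + cx + c(c-1)x^2/2! + ..."""
    total = 0.0
    term = 1.0  # the r = 0 term
    for r in range(terms):
        total += term
        # Passing from the r-th term to the next multiplies by (c - r) x / (r + 1).
        term *= (c - r) * x / (r + 1)
    return total

x = 0.5
print(binomial_series(-1, x, 60), 1 / (1 + x))        # the case c = -1
print(binomial_series(0.5, x, 60), (1 + x) ** 0.5)    # a square root
print(binomial_series(3, 2.0, 10), (1 + 2.0) ** 3)    # natural c: the series stops
```

When c is a natural number the factor $(c - r)$ eventually vanishes, so the series terminates and we recover the ordinary binomial theorem, with no restriction on x.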


Appendix 2: The Language of Set Theory 


With the exception of Chapter 11, this book has not used set theory at all. Indeed 
for most of the book it wasn’t necessary. However it is an essential tool for going 
further in analysis and a brief account of it might help to make other textbooks 
more accessible. This short appendix gives an introduction to this important area 
of mathematics. 

A set is a mathematical way of representing a collection of ‘objects’. It doesn’t 
matter what these objects are. They could be mundane things or mathematical 
symbols. Let’s begin with an example. The colours of the rainbow are red, orange, 
yellow, green, blue, indigo and violet. We will collect them together as a set which 
we will denote by C for ‘colours’. We write this set 

C = {red, orange, yellow, green, blue, indigo, violet}. 

On the left-hand side of the equals sign we have the name we’ve given our set. On 
the right-hand side we have the list of the elements or members of the set. These 
are separated by commas. The braces { and } signal the beginning and end of the 
set (respectively). In a set the order in which we write the list is irrelevant.² I have 
chosen to write the rainbow colours in the familiar ordering whereby wavelength 
decreases from left to right but I could just as easily have written e.g. 

C = {yellow, orange, violet, blue, red, indigo, green}. 

It is helpful to have a mathematical way of indicating set membership and the 
symbol $\in$ fulfils this role. So if we want to write succinctly that ‘yellow is a member 
of the set C’ we just say yellow $\in$ C. But although black is a perfectly good colour, 
it does not appear in the rainbow and so is not a member of C. We denote this by 
black $\notin$ C. 

Set theory is of universal use in mathematics. In analysis (at least at the level 
of this book) we usually want to consider sets whose elements are numbers. For 
example we might want to consider the set of all integers which lie strictly between 
—5 and 2 and this is the set 

$$S_1 = \{-4, -3, -2, -1, 0, 1\}.$$

This set has precisely six elements but many important sets have an infinite 
number of elements. For example, the set of all natural numbers is universally 
denoted by the symbol $\mathbb{N}$. We cannot write this as a full list but we can at least 
indicate how the list starts: 

$$\mathbb{N} = \{1, 2, 3, 4, 5, \ldots\}.$$

The sets of all integers, rational numbers and real numbers are written $\mathbb{Z}$, $\mathbb{Q}$ and 
$\mathbb{R}$ (respectively). We can write 

$$\mathbb{Z} = \{\ldots, -3, -2, -1, 0, 1, 2, 3, 4, \ldots\}.$$

Sometimes we can write a set in a succinct way by using a formula or relation 
that generates all the elements in the list. For example consider the set $S_1$ that we 
defined above. We can also write this 

$$S_1 = \{x \in \mathbb{Z};\ -4 \le x \le 1\}$$

or equivalently 

$$S_1 = \{x \in \mathbb{Z};\ -5 < x < 2\}.$$

The semicolon here stands for ‘such that’ or ‘for which’, so the meaning of the 
right-hand side in the first of these expressions for $S_1$ is precisely that we have the 
set of all integers x for which x lies between −4 and 1 (inclusive). 
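
Readers who know a little programming may recognise this ‘such that’ notation, as Python’s set comprehensions are modelled on it. The sketch below is only an illustration and rests on one loud assumption: a computer cannot hold all the integers, so a finite window (here −100 to 100, a range I have picked arbitrarily) stands in for them. I have also read the first description as including both endpoints −4 and 1.

```python
# The integers are infinite, so a finite window stands in for them here.
window = range(-100, 101)

S1_inclusive = {x for x in window if -4 <= x <= 1}   # both endpoints included
S1_strict = {x for x in window if -5 < x < 2}        # strictly between -5 and 2

print(sorted(S1_inclusive))
print(S1_inclusive == S1_strict)
print(-3 in S1_inclusive, 2 in S1_inclusive)

# The order in which the elements are listed is irrelevant.
print({1, 0, -1, -2, -3, -4} == S1_inclusive)
```

The `in` operator plays the role of the membership symbol, and comparing two sets with `==` ignores the order in which their elements were listed.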


² There is a notion of an ordered set which is important in the axiomatic approach to 
mathematical analysis, but we won't pursue that direction here. 

The rational numbers can be given a nice description from this point of view: 

$$\mathbb{Q} = \left\{x \in \mathbb{R};\ x = \frac{p}{q},\ p \in \mathbb{Z},\ q \in \mathbb{N}\right\}.$$



Relationships between sets are important. We say that a set A is a subset of a 
set B if every element of A is also an element of B. In this case we write $A \subseteq B$. 
The line underneath $\subset$ indicates that it may well be that A and B are the same 
set. If we know that $A \subseteq B$ but that there are elements of B that are not in A then 
we write $A \subset B$ and say that A is a proper subset of B. The relationship between 
$\subset$ and $\subseteq$ is analogous to that between < and ≤ in the study of inequalities. For 
example, considering the sets we’ve discussed above, we have $S_1 \subset \mathbb{Z}$ and 

$$\mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R}.$$

In Chapter 8 we introduced the complex numbers and we can also write the 
set of all of these as 

$$\mathbb{C} = \{x + iy;\ x, y \in \mathbb{R}\},$$

where $i = \sqrt{-1}$. As every real number x can be written in the form $x + i0$ we see 
that $\mathbb{R} \subset \mathbb{C}$. 

Intervals are examples of subsets of $\mathbb{R}$, e.g. $[a, b] = \{x \in \mathbb{R};\ a \le x \le b\}$. 

In many areas of mathematics where set theory is used we may often identify a 
universal set which has the property that all other sets that we consider are subsets 
of it. For much of this book we have been concerned with the subject of real 
analysis and here the universal set is $\mathbb{R}$. We have briefly touched on the subject of 
complex analysis where the universal set is $\mathbb{C}$. For much elementary work on the 
theory of numbers we might take $\mathbb{N}$ to be our universal set. 

Now suppose that X is the universal set and that A and B are subsets of X. 
There are useful ways in which we can form new sets from A and B. The most 
basic of these are called the complement, union and intersection. The complement 
of A is usually denoted $A^c$ or $\overline{A}$. It is defined to be the set of all elements of X that are 
not in A: 

$$A^c = \{x \in X;\ x \notin A\}.$$

The union of A and B is written $A \cup B$. It is the set of all elements of X which are 
either members of A, members of B or members of both A and B. The intersection 
of A and B is denoted $A \cap B$. It is the set of all elements of X which are members 
of both A and B. 

A simple example may help to clarify these definitions: 

Let X = {1,2,3,4,5,6,7,8,9,10} and define A = {2, 4, 6, 8, 10}, B = 
{3,6, 8,9} and C = {1,6, 10}. 

Then $A^c = \{1, 3, 5, 7, 9\}$, $B \cup C = \{1, 3, 6, 8, 9, 10\}$ and $B \cap C = \{6\}$. 

The empty set $\emptyset$ is defined to be $X^c$. It is a set that contains no elements and it 
can be very useful for expressing that which can never be, e.g. 

$$\emptyset = \{x \in \mathbb{N};\ x \text{ is prime, } x \text{ is even and } x > 2\}.$$
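
The simple example above can be replayed in Python, whose built-in sets support these operations directly. This is a sketch of my own rather than part of the original appendix; the helper `is_prime` is hypothetical, and the search for even primes greater than 2 is confined to the universal set X of the example rather than to all natural numbers.

```python
X = set(range(1, 11))
A = {2, 4, 6, 8, 10}
B = {3, 6, 8, 9}
C = {1, 6, 10}

complement_of_A = X - A      # elements of X that are not in A
union_B_C = B | C            # in B or in C or in both
intersection_B_C = B & C     # in both B and C

print(sorted(complement_of_A))
print(sorted(union_B_C))
print(sorted(intersection_B_C))

def is_prime(k):
    """Trial division; fine for the small numbers in X."""
    return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))

# Even primes greater than 2, searched for within X only.
even_primes_above_2 = {x for x in X if is_prime(x) and x % 2 == 0 and x > 2}
print(even_primes_above_2 == set())
print(X - X == set())        # the complement of X is empty
```

Note that Python has no built-in universal set, which is why the complement is written as the difference `X - A`.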

As indicated above, the union is defined using the ‘inclusive or’ (i.e. in A or in 
B or in both) rather than the ‘exclusive or’ of ordinary language. Whenever I write 
‘or’ from now on in this section I will mean this inclusive sense. Now suppose 
that $A_1, A_2, \ldots, A_n$ are a finite number of subsets of X. Then we can define the 
intersection $A_1 \cap A_2 \cap \cdots \cap A_n$ to be the subset of all elements of X that are in every 
set $A_j$ where $1 \le j \le n$ and the union $A_1 \cup A_2 \cup \cdots \cup A_n$ to be the subset comprising 
those elements of X that are in at least one of the $A_j$s. You should think about how 
the phrase ‘at least one’ is exactly equivalent to being a member of $A_1$ or of $A_2$ 
or of $A_3$, and so on. Sometimes we use a similar notation to the sigma notation for sums to 
express these unions and intersections, so we have $\bigcap_{j=1}^{n} A_j = A_1 \cap A_2 \cap \cdots \cap A_n$ 
and $\bigcup_{j=1}^{n} A_j = A_1 \cup A_2 \cup \cdots \cup A_n$. These ideas extend to countably or even 
uncountably many sets and the union over the former appears in Theorem 11.3.1. 
To see how this works, let’s suppose we have a sequence of sets $(A_n)$. We define 
$\bigcap_{n=1}^{\infty} A_n$ to be the subset of X comprising elements that belong to every member of 
the sequence and $\bigcup_{n=1}^{\infty} A_n$ to be the subset of X comprising elements that belong 
to at least one member of the sequence. For example if we take $X = \mathbb{R}$ then you 
may want to think about why $[0, \infty) = \bigcup_{n=1}^{\infty} [n - 1, n]$ and $(0, 1] = \bigcup_{n=2}^{\infty} \left(\frac{1}{n}, 1\right]$. 

There is a great deal more that we could say about set theory, but this is a good 
place to stop as we have more than covered all that is needed for the rather minimal 
use that we’ve made of it in the main part of the text. 


Appendix 3: Proof by Mathematical Induction 


In the introduction I pointed out that I would avoid using proof by mathematical 
induction in this book. This appendix is included for those of you who want 
to know what this is, how it works and what it can do for us. Be aware that 
this technique is indispensable for anyone doing an undergraduate degree that 
involves sophisticated mathematics. 

The context for mathematical induction is a sequence of propositions 
$(P_n)$ which start at some nonnegative integer $n_0$. So the sequence begins 
$P_{n_0}, P_{n_0+1}, P_{n_0+2}, P_{n_0+3}, \ldots$ Quite often we find that $n_0$ is 1 or 0 but this is not 
essential. It is however crucial for this particular method of proof that the total 
number of propositions is (countably) infinite. Mathematical induction is a device 
that gives a method for simultaneously proving the validity of all the propositions 
$P_n$ provided that two steps are carried out successfully: 

Step 1 (The Initial Step.) Prove that $P_{n_0}$ is true. 

Step 2 (The Inductive Step.) Prove that if for some arbitrary n, $P_n$ is true then 
$P_{n+1}$ is true. 

If steps 1 and 2 can both be carried out successfully then the principle of 
mathematical induction states that $P_n$ is true for every $n \ge n_0$. We will take 
for granted that this principle holds. You should be able to see the logic that 
underlies it. Step 1 tells us that $P_{n_0}$ is true. Now apply step 2 with $n = n_0$ to 
deduce that $P_{n_0+1}$ is true. Then apply step 2 again with $n = n_0 + 1$ to see that 
$P_{n_0+2}$ is true, and so on, ad infinitum. The proposition $P_n$ (where n is arbitrary) 
is sometimes called the inductive hypothesis in the context of mathematical 
induction. 

Example A.1: In Chapter 6, we met the famous formula for the sum of the first 
n natural numbers. Two proofs were given of this result, one allegedly due to 
Gauss as a schoolboy and the other using the formula for the sum of an arithmetic 
progression. We’ll now give a proof by mathematical induction. Here $n_0 = 1$ and 
$P_n$ is the proposition 

$$1 + 2 + \cdots + n = \frac{1}{2} n(n + 1).$$

Step 1. When n = 1 the left-hand side of the formula is 1 and the right-hand 
side is $\frac{1}{2} \cdot 1 \cdot (1 + 1) = 1$ so clearly $P_1$ holds. 

Step 2. Assume the result holds for some n. Then 

$$\begin{aligned}
1 + 2 + \cdots + n + (n + 1) &= (1 + 2 + \cdots + n) + (n + 1) \\
&= \frac{1}{2} n(n + 1) + (n + 1) \\
&= (n + 1)\left(\frac{n}{2} + 1\right) \\
&= \frac{1}{2}(n + 1)(n + 2),
\end{aligned}$$

and so we see that $P_{n+1}$ is true. 

Step 1 and Step 2 have both been verified and so we can assert that the required 
result holds for all natural numbers n by mathematical induction. As an exercise 
you may like to try to use mathematical induction to prove that 

$$1 + 2^2 + 3^2 + \cdots + n^2 = \frac{1}{6} n(n + 1)(2n + 1),$$

and 

$$1 + 2^3 + 3^3 + \cdots + n^3 = \frac{1}{4} n^2 (n + 1)^2.$$
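
Before attempting a proof it can be reassuring to test a conjectured formula on a computer. The following Python sketch (my own, with a hypothetical function name and an arbitrary cut-off of 1000) keeps running totals of k, k² and k³ and compares them with the three closed forms above. Passing from n to n + 1 by adding one more term is the computational shadow of the inductive step, but please remember that checking finitely many cases is evidence, not proof.

```python
def check_sum_formulas(limit):
    """Compare running sums of k, k^2 and k^3 with the closed forms for n <= limit."""
    s1 = s2 = s3 = 0
    for n in range(1, limit + 1):
        # Move from the case n - 1 to the case n by adding one more term.
        s1 += n
        s2 += n ** 2
        s3 += n ** 3
        if s1 != n * (n + 1) // 2:
            return False
        if s2 != n * (n + 1) * (2 * n + 1) // 6:
            return False
        if s3 != (n * (n + 1) // 2) ** 2:   # equals n^2 (n + 1)^2 / 4
            return False
    return True

print(check_sum_formulas(1000))
```

All the arithmetic here is done with whole numbers, so there are no rounding errors to worry about.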

In analysis, proof by induction can be a useful tool for establishing inequalities. 

Example A.2: In Section 5.3 we met the sequence $(a_n)$ defined by 

$$a_1 = 1 \quad \text{and} \quad a_{n+1} = \sqrt{1 + a_n} \quad \text{for } n = 1, 2, 3, \ldots$$

We used a proof by contradiction there to establish the bound $a_n \le 2$ for all n. 
Now let’s proceed by induction. Here we again have $n_0 = 1$ and Step 1 is obvious. 
For Step 2, assume the bound holds for some n. Then $a_{n+1} = \sqrt{1 + a_n} \le \sqrt{1 + 2} = \sqrt{3} < 2$. 

So the result holds for n + 1 and so is true for all n by induction. 
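
Here is a short numerical illustration of the bound, offered as a sketch only: the function name and the decision to look at twenty terms are my own assumptions, and floating-point square roots are approximations. It generates the sequence from the recurrence and confirms that none of the computed terms reaches 2.

```python
from math import sqrt

def sequence(count):
    """First `count` terms of a_1 = 1, a_{n+1} = sqrt(1 + a_n)."""
    terms = [1.0]
    while len(terms) < count:
        terms.append(sqrt(1 + terms[-1]))
    return terms

terms = sequence(20)
print(terms[:4])
print(all(a < 2 for a in terms))
# The inductive step shows a little more: every term after the first is at most sqrt(3).
print(all(a <= sqrt(3) for a in terms[1:]))
```

The computer can only ever inspect finitely many terms, which is exactly why the induction argument is needed to cover them all.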

Example A.3: Here we return to Problem 6 at the end of Chapter 3. There you 
were invited to prove the inequality $(a + b)^2 \le 2(a^2 + b^2)$ and you were also asked 
to guess what shape the inequality takes when the two real numbers a and b are 
replaced by n real numbers. Here’s the answer: 

$$(a_1 + a_2 + \cdots + a_n)^2 \le n(a_1^2 + a_2^2 + \cdots + a_n^2).$$

We will prove this by using mathematical induction. Here we take $n_0 = 2$ and 
observe that Step 1 was exactly what you established in Problem 6 in Chapter 3. 
We have to work a little to carry out the inductive step. Assume that the required 
inequality holds for some n. Then 

$$\begin{aligned}
(a_1 + a_2 + \cdots + a_n + a_{n+1})^2 &= [(a_1 + a_2 + \cdots + a_n) + a_{n+1}]^2 \\
&= (a_1 + a_2 + \cdots + a_n)^2 + 2a_{n+1}(a_1 + a_2 + \cdots + a_n) + a_{n+1}^2 \\
&\le n(a_1^2 + a_2^2 + \cdots + a_n^2) + 2a_{n+1}a_1 + 2a_{n+1}a_2 + \cdots + 2a_{n+1}a_n + a_{n+1}^2.
\end{aligned}$$

Now use the fact that $2a_{n+1}a_j \le a_{n+1}^2 + a_j^2$ for each $1 \le j \le n$,³ to find 

$$\begin{aligned}
(a_1 + a_2 + \cdots + a_n + a_{n+1})^2 &\le n(a_1^2 + a_2^2 + \cdots + a_n^2) + n a_{n+1}^2 + a_1^2 + a_2^2 + \cdots + a_n^2 + a_{n+1}^2 \\
&= (n + 1)(a_1^2 + a_2^2 + \cdots + a_n^2 + a_{n+1}^2),
\end{aligned}$$


and the required result follows by mathematical induction. 
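
If you would like some numerical reassurance about this inequality, the sketch below tries it out on a thousand randomly generated lists of real numbers. Everything about the experiment is an assumption of mine (the function name, the range of values, the list lengths, the fixed seed and the small tolerance for rounding error); it illustrates the result but is no substitute for the proof.

```python
import random

def inequality_holds(values):
    """Check (a_1 + ... + a_n)^2 <= n (a_1^2 + ... + a_n^2)."""
    n = len(values)
    left = sum(values) ** 2
    right = n * sum(a * a for a in values)
    # A small tolerance absorbs floating-point rounding.
    return left <= right + 1e-9 * max(1.0, abs(right))

random.seed(0)  # fixed seed so that the run is repeatable
trials = [[random.uniform(-10, 10) for _ in range(random.randint(2, 8))]
          for _ in range(1000)]
print(all(inequality_holds(values) for values in trials))
print(inequality_holds([3, 3, 3]))  # equal entries: both sides are 81
```

The last line is worth a moment’s thought: when all the numbers are equal the two sides coincide, so the inequality cannot be improved to a strict one.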


³ This follows from the fact that $(a_{n+1} - a_j)^2 \ge 0$ for $1 \le j \le n$ and was already used in 
Problem 6 of Chapter 3. 


Appendix 4: The Algebra of Numbers 


Suppose that a, b and c are arbitrary natural numbers (respectively, integers, 
rational numbers, real numbers, complex numbers). Then the sum a + b and 
product ab are also natural numbers (respectively, integers, rational numbers, 
real numbers, complex numbers) and the following always hold: 

• Commutative Law of Addition 

$$a + b = b + a,$$

• Associative Law of Addition 

$$(a + b) + c = a + (b + c),$$

• Commutative Law of Multiplication 

$$ab = ba,$$

• Associative Law of Multiplication 

$$a(bc) = (ab)c,$$

• Distributive Law of Multiplication Over Addition 

$$a(b + c) = ab + ac.$$

The integers (respectively, rational numbers, real numbers, complex numbers) 
have the property that each element a has an additive inverse −a, i.e. 

$$a + (-a) = (-a) + a = 0,$$

and the set of all integers (respectively, rational numbers, real numbers, complex 
numbers) is an example of an important algebraic structure called a ring. The 
rational numbers (respectively real numbers, complex numbers) have the property 
that each non-zero element a has a multiplicative inverse $\frac{1}{a}$, i.e. 

$$a \cdot \frac{1}{a} = \frac{1}{a} \cdot a = 1,$$

and the set of all rational numbers (respectively real numbers, complex numbers) 
is an example of another important algebraic structure called a field. The sets of 
rational numbers and real numbers (but not the set of complex numbers) are 
both provided with an order relation < and these are called ordered fields. Both 
the sets of real numbers and the complex numbers (but not the set of rational 
numbers) have the completeness property (see Chapter 11) that every Cauchy 
sequence converges to a limit therein. The set of real numbers is the unique 
complete ordered field. If you are curious to know what a ring or a field is then 
you can search for these concepts on Wikipedia, but it’s even better to read an 
introductory text on abstract algebra. 


Hints and Solutions to Selected Exercises 


Chapter 1 


Exercise 1.3 Write $m = 2k - 1$ and $n = 2l$. 

Exercise 1.4 Add together m, m + 1, m + 2 and m + 3 and show that the answer 
is of the form 2p. 


Exercise 1.5 (a) If n is odd then we can write $n = 2p - 1$ but either p is even 
($p = 2m$) or p is odd ($p = 2m + 1$). Every odd number greater than 
3 of the form $4m - 1$ can be written in the form $4k + 3$ by writing 
$m = k + 1$. 


Exercise 1.7 (a) 

p = 2, 2^2 - 1 = 3 which is prime 
p = 3, 2^3 - 1 = 7 which is prime 
But for p = 4, 2^4 - 1 = 15 = 3 × 5 which is composite 
p = 5, 2^5 - 1 = 31 which is prime 

In fact p = 7, 13, 17 and 19 also yield Mersenne primes, but as 
of October 2009 only 47 of these have been found, the largest of 
which corresponds to p = 43112609 (see 
http://en.wikipedia.org/wiki/Mersenne_prime). 

(b) 

n = 0, 2^(2^0) + 1 = 3 which is prime 
n = 1, 2^(2^1) + 1 = 5 which is prime 
n = 2, 2^(2^2) + 1 = 17 which is prime 
n = 3, 2^(2^3) + 1 = 257 which is prime 
n = 4, 2^(2^4) + 1 = 65537 which is prime. 

No other Fermat primes are known. When n = 5, 2^(2^5) + 1 = 
4294967297 = 641 × 6700417. 
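
The small cases in Exercise 1.7 can be checked mechanically. The following Python sketch is only an illustration: the helper name is_prime and the choice of trial division are assumptions made here, not something taken from the text.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small numbers in this exercise."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


# (a) Mersenne numbers 2^p - 1 for p = 2, 3, 4, 5
mersenne = {p: 2**p - 1 for p in range(2, 6)}

# (b) Fermat numbers 2^(2^n) + 1 for n = 0, 1, ..., 5
fermat = {n: 2 ** (2**n) + 1 for n in range(6)}

for p, m in mersenne.items():
    print("p =", p, m, "prime" if is_prime(m) else "composite")

for n, f in fermat.items():
    print("n =", n, f, "prime" if is_prime(f) else "composite")
```

Trial division stops at the first divisor it finds, so the case n = 5 is settled quickly by the factor 641.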


Exercise 1.8 Every number between 1 and 40 (inclusive) generates a prime 
number but n = 41 yields a number that is composite. 
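
The claim in Exercise 1.8 can be tested by brute force. The formula is not restated in this hint, so the sketch below assumes that it is n^2 - n + 41, which is consistent with the answer given; both that formula and the helper is_prime are assumptions of this illustration.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for numbers of this size."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def f(n: int) -> int:
    # Assumed form of the formula in Exercise 1.8 (not restated in the hint).
    return n * n - n + 41


all_prime_up_to_40 = all(is_prime(f(n)) for n in range(1, 41))
print(all_prime_up_to_40)
print(f(41), is_prime(f(41)))  # f(41) = 41 * 41
```

Under this assumption the failure at n = 41 is visible without any computation, since the terms n^2 and 41 are then both divisible by 41 and the middle term cancels one of them.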




Chapter 2 


Exercise 2.1 0.1101 
Exercise 2.2 0.076923 
Exercise 2.3 (b) |||| 

Exercise 2.4 First show that 5^ = and 5^ 
Exercise 2.5 (a) 1296 = 36^2. 

Exercise 2.6 e.g. * = \[l andy = -4=. 
Exercise 2.9 The answer is yes and here is a ‘non-constructive’ proof. √2^√2 is 
either rational or irrational. If it is rational choose a = b = √2 and 
if it is irrational choose a = √2^√2 and b = √2. [In fact it can be 
shown that √2^√2 is irrational but this needs some very advanced 
techniques.] 
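
The second case works because (√2^√2)^√2 = √2^(√2 · √2) = √2^2 = 2, which is rational. A floating-point check of this identity is sketched below in Python; it illustrates the arithmetic only and proves nothing about rationality, and the variable names are chosen here for the illustration.

```python
import math

root2 = math.sqrt(2)

# Candidate values in the case where sqrt(2)^sqrt(2) is irrational.
a = root2**root2
b = root2

# a^b should equal 2 up to floating-point rounding.
print(a**b)
print(math.isclose(a**b, 2.0, rel_tol=1e-9))
```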


Exercise 2.10 It is irrational as it is not periodic or even eventually so. 

For an example of an appropriate rational number take 
0.12345678910120 

and for an appropriate irrational number: 
0.1234567891011242628303234 


Chapter 3 


Exercise 3.2 bd - ac = bd - bc + bc - ac = b(d - c) + c(b - a) > 0. 

Exercise 3.3 x < -3 or -1 < x < 2. 

Exercise 3.4 0 < x < 2. 

Exercise 3.7 (1 + x)^r = 1 + rx + (1/2)r(r - 1)x^2 + (1/6)r(r - 1)(r - 2)x^3 + ... + x^r > 
1 + rx. 
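
The expansion in Exercise 3.7 can be checked with exact rational arithmetic. The hypotheses of the exercise are not restated in this hint, so the sketch below assumes x > 0 and a natural number r ≥ 2; the function name binomial_terms is hypothetical.

```python
from fractions import Fraction
from math import comb


def binomial_terms(x: Fraction, r: int) -> list:
    """Terms C(r, k) * x^k of the expansion of (1 + x)^r, for k = 0, ..., r."""
    return [comb(r, k) * x**k for k in range(r + 1)]


x = Fraction(1, 10)  # assumed sample value with x > 0
for r in range(2, 8):
    terms = binomial_terms(x, r)
    # The terms add up to (1 + x)^r exactly.
    assert sum(terms) == (1 + x) ** r
    # Dropping the positive terms after 1 + rx gives a strictly smaller number.
    assert sum(terms) > 1 + r * x
    print(r, (1 + x) ** r, 1 + r * x)
```

Every term after 1 + rx is positive when x > 0, which is where the strict inequality comes from.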


Chapter 4 


Exercise 4.1 Limit is 1. (a) (i) 30, (ii) 300, (iii) 3000, (iv) 30000, (v) 30000000000. 

(b) Take n_0 to be the smallest natural number so that n_0 + 1 
exceeds 3/ε. 
Exercise 4.2 Limits are (a) 1, (b) 0, (c) 0, (d) 0. 


Exercise 4.4 Use ||x_n| - |x|| ≤ |x_n - x|. Converse is false - e.g. consider (-1)^n. 

Exercise 4.6 Limits are (a) 6, (b) 1, (c) |, (d) (e) 0. 

Hint for (e) - multiply top and bottom by √(n + 1) + √n. 


Exercise 4.8 (b) We must show that i < € for all n > N. But this means that 
n > - for such n. Suppose that such an N can be found and choose 
e — Now take n — N + 1. What do you find? 

Exercise 4.11 c_n - a_n ≥ b_n - a_n ≥ 0. 


Exercise 4.13 If x_n → l then x_{n+1} → l as n → ∞. (i) 1, (ii) i. 


Exercise 4.15 Let (a_{n_r}) be a subsequence. Given ε > 0 there exists N such that 
if n > N then |a_n - l| < ε. Choose a natural number R to be such 
that n_r > N whenever r > R and the result follows. 


Chapter 5 


Exercise 5.2 (a) sup 3, inf -2, increasing to 3, (b) sup 2, inf 1, decreasing to 1, 
(c) inf 2, increasing, diverges to +∞, (d) inf -1, sup | , oscillates 
finitely, (e) inf 0, sup 2, converges to 0. 


Exercise 5.3 (a) Write A = sup(a_n) and B = sup(b_n). As a_n ≤ A and b_n ≤ B for 
all n it follows from Exercise 3.2 that a_n b_n ≤ AB for all n. So AB is 
an upper bound for the sequence (a_n b_n) and so AB ≥ sup(a_n b_n). 
A counter-example is a_n = 1 + 1/n, b_n = 1 - 1/n. 

Exercise 5.4 inf(a_n) inf(b_n) ≤ inf(a_n b_n). A counter-example for the first 
inequality in the question is as in Exercise 5.3 (a) above, for the 
second take a_n = n, b_n = 1/n. 


Exercise 5.6 The sequence is monotonic decreasing to 2. 

Exercise 5.7 (a) The Theorem of the Means is a useful tool. (c) Deduce that 
(a_n) is monotonic decreasing and bounded below and that (b_n) is 
monotonic increasing and bounded above. To show that they have 
the same limit, first establish that a_{n+1} - b_{n+1} < (1/2)(a_n - b_n). 


Exercise 5.10 (a) By the result of (8) a_{n_k} ≤ l for all k. Now for any n_k < n < n_{k+1} 
we have a_n ≤ a_{n_{k+1}} ≤ l and the result follows. (b) Given any ε > 0 
we can find K such that if k > K then l - ε < a_{n_k} < l + ε. Then 
for n > n_k, l - ε < a_{n_k} ≤ a_n ≤ l < l + ε and the result follows. 

Exercise 5.12 (i) lim sup 1, lim inf -1, (ii) lim sup 0, lim inf 0, (iii) lim sup 1, 
lim inf -1. 


Chapter 6 


Exercise 6.1 (a) Converges to 150, (b) diverges, (c) converges to 2, (d) converges 
to |, (e) converges to 1/(1 - sin(θ)) for all values of θ except those 
with sin(θ) = ±1, where the series diverges. 

Exercise 6.3 Using (6.7.14), Σ_{n=1}^∞ a_n = Σ_{n=1}^{N-1} a_n + Σ_{n=N}^∞ a_n and then by 
algebra of limits, 

lim_{N→∞} Σ_{n=N}^∞ a_n = lim_{N→∞} (Σ_{n=1}^∞ a_n - Σ_{n=1}^{N-1} a_n) 
= Σ_{n=1}^∞ a_n - Σ_{n=1}^∞ a_n = 0. 

Exercise 6.4 (a)|. 

Exercise 6.5 (a) Does lim_{n→∞} a_n = 0 in either (i) or (ii)? 

Exercise 6.6 (a), (b), (d) converge - (c), (e) diverge. 

Exercise 6.7 (a), (b), (c) all converge (for (c) note that this is because t < 1), 
(d) diverges. 

Exercise 6.8 (a) converges (ratio test if you know about e or root test, for 
which see Exercise 6.13), (b), (c) diverge (both comparison test), 
(d) converges (ratio test). 

Exercise 6.10 Use the Theorem of the Means to show that 

√(a_n a_{n+1}) ≤ (1/2)(a_n + a_{n+1}). 

Exercise 6.15 (a) All three series are convergent by the Leibniz test. (b) (i) is 
not absolutely convergent, but (ii) and (iii) both are - you should 
use the comparison test in all three cases. For (ii), note first that 
for n > 2, 3n^2 - 2n > 3n^2 - 6n = 3n(n - 2) > 3(n - 2)^2 so 

1/(3n^2 - 2n) < 1/(3(n - 2)^2). 

Exercise 6.16 (a) False - e.g. a_n = 1/n. 

(b) True. If Σ_{n=0}^∞ |a_n| converges then lim_{n→∞} a_n = 0, hence we can 
find n_0 such that n > n_0 ⇒ |a_n| < 1. But then a_n^2 ≤ |a_n| for all 
n > n_0 and the result follows from the comparison test. 

Exercise 6.17 e.g. a_n = (-1)^n/√n. 

Exercise 6.19 (a) Since the series converges to s say, then given ε > 0 there exists 
N such that if n > N then |s_n - s| < ε/2. Then 

|(s_1 + s_2 + ... + s_n)/n - s| 

= |(s_1 + s_2 + ... + s_n - ns)/n| 

≤ |(s_1 + s_2 + ... + s_N - Ns)/n| + |(s_{N+1} + ... + s_n - (n - N)s)/n| 

≤ |(s_1 + s_2 + ... + s_N - Ns)/n| + |s_{N+1} - s|/n + |s_{N+2} - s|/n + ... + |s_n - s|/n 

≤ |(s_1 + s_2 + ... + s_N - Ns)/n| + (n - N)ε/(2n) 

< |(s_1 + s_2 + ... + s_N - Ns)/n| + ε/2, 

and the result follows since lim_{n→∞} 1/n = 0 and so for sufficiently 
large n, |(s_1 + s_2 + ... + s_N - Ns)/n| < ε/2 (you should fill in the details 
to make this precise). (b) s_n = (1/2)(1 - (-1)^n). If n is even then 
s'_n = (1/n)(n/2) = 1/2 and if n is odd, s'_n = (1/n)((n + 1)/2) → 1/2 as n → ∞, from 
which the result follows. 
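
The averages in Exercise 6.19 (b) can be computed exactly. The Python sketch below assumes, as in the hint, that s_n = (1/2)(1 - (-1)^n) for n = 1, 2, ... and that s'_n denotes the average of s_1, ..., s_n; the function names are chosen here for the illustration.

```python
from fractions import Fraction


def partial_sum(n: int) -> Fraction:
    """s_n = (1 - (-1)^n) / 2, i.e. 1 for odd n and 0 for even n."""
    return Fraction(1 - (-1) ** n, 2)


def cesaro_mean(n: int) -> Fraction:
    """s'_n = (s_1 + s_2 + ... + s_n) / n."""
    total = sum((partial_sum(k) for k in range(1, n + 1)), Fraction(0))
    return total / n


for n in (1, 2, 3, 10, 11, 1000, 1001):
    print(n, cesaro_mean(n))
```

For even n the average is exactly 1/2, and for odd n it is (n + 1)/(2n), which approaches 1/2, even though the sequence (s_n) itself does not converge.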